Abstract

We give the first comprehensive analysis of the effect of boundary conditions on the mixing time of 
the Glauber dynamics in the so-called Bethe approximation. Specifically, we show that the spectral gap
and the log-Sobolev constant of the Glauber dynamics for the Ising model on an n-vertex regular
tree with (+)-boundary are bounded below by a constant independent of n at all temperatures and
all external fields. This implies that the mixing time is O(log n) (in contrast to the free boundary
case, where it is not bounded by any fixed polynomial at low temperatures). In addition, our
methods yield simpler proofs and stronger results for the spectral gap and log-Sobolev constant
in the regime where there are multiple phases but the mixing time is insensitive to the boundary
condition. Our techniques also apply to a much wider class of models, including those with
hard-core constraints like the antiferromagnetic Potts model at zero temperature (proper colorings) and
the hard-core lattice gas (independent sets).
1 Introduction



In this paper we analyze the influence of boundary conditions on the Glauber dynamics for
discrete spin models on a regular rooted tree. Although in what follows we focus for simplicity
on the well-known Ising model, our techniques also apply to other models, including models
that are not ferromagnetic and models with hard-core constraints.

In the Ising model on a finite graph $G = (V, E)$, a configuration $\sigma = (\sigma_x)$ consists of an
assignment of $\pm 1$-values, or "spins", to each vertex (or "site") of $V$. The probability of finding the system
in configuration $\sigma \in \{\pm 1\}^V =: \Omega_G$ is given by the Gibbs distribution

$$\mu_G(\sigma) \;\propto\; \exp\Big[\beta\Big(\sum_{xy \in E} \sigma_x\sigma_y + h \sum_{x \in V} \sigma_x\Big)\Big], \qquad (1)$$

where $\beta \ge 0$ is the inverse temperature and $h$ the external field. Boundary conditions can also be
taken into account by fixing the spin values at some specified "boundary" vertices of G; the term 
free boundary is used to indicate that no boundary condition is specified. 
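
The definition can be made concrete by brute-force enumeration on a very small graph. The sketch below is only illustrative: it assumes the standard form of the Gibbs weights, proportional to exp[β(Σ σ_x σ_y + h Σ σ_x)] (the same form used for the conditional distributions in Section 2.1), and the function name `gibbs_distribution` and the three-vertex path are hypothetical choices made here, not objects from the paper.

```python
import itertools
import math

def gibbs_distribution(vertices, edges, beta, h):
    """Return {spin tuple: probability} for the Ising model on a small graph.

    Assumed weights: exp(beta * (sum_{xy in E} s_x s_y + h * sum_x s_x)).
    """
    weights = {}
    for spins in itertools.product((-1, +1), repeat=len(vertices)):
        sigma = dict(zip(vertices, spins))
        interaction = sum(sigma[x] * sigma[y] for x, y in edges)
        field = h * sum(spins)
        weights[spins] = math.exp(beta * (interaction + field))
    z = sum(weights.values())  # partition function (normalizing constant)
    return {spins: w / z for spins, w in weights.items()}

# Path on three vertices with free boundary and zero external field.
mu = gibbs_distribution(["a", "b", "c"], [("a", "b"), ("b", "c")], beta=0.5, h=0.0)
```

Enumeration costs 2^|V| operations, so this is useful only as a sanity check on toy examples.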

In the classical Ising model, $G = G_n$ is a cube of side $n^{1/d}$ in the $d$-dimensional Cartesian
lattice $\mathbb{Z}^d$, and in this case the phase diagram in the thermodynamic limit $G_n \uparrow \mathbb{Z}^d$ is quite well
understood (see, e.g., [15, 38] for more background).

While the classical theory focused on static properties of the Gibbs measure, in the last decade 
the emphasis has shifted towards dynamical questions with a computational flavor. The key object 
here is the Glauber dynamics, a (discrete- or continuous-time) Markov chain on the set of spin 
configurations $\Omega_G$ in which each spin $\sigma_x$ flips its value with a rate that depends on the current
configuration of the neighboring spins of $x$, and which satisfies the detailed balance condition w.r.t.
the Gibbs measure $\mu_G$ (see Section 2 for more details).

The Glauber dynamics is much studied for two reasons: firstly, it is the basis of Markov chain 
Monte Carlo algorithms, widely used in computational physics for sampling from the Gibbs dis- 
tribution; and secondly, it is a plausible model for the actual evolution of the underlying physical 
system towards equilibrium. In both contexts, one of the central questions is to determine the 
mixing time, i.e., the time until the dynamics is close to its stationary distribution. 

As is well known (see, e.g., [36]), the approach to stationarity of a reversible Markov chain
with Markov generator $\mathcal{L}$ and reversible measure $\pi$ can be successfully studied by analyzing two
key quantities: the spectral gap and the logarithmic Sobolev constant of the pair $(\mathcal{L}, \pi)$†. The first of
these measures the rate of the exponential decay as $t \to \infty$ of the variance $\mathrm{Var}_\pi(e^{t\mathcal{L}} f)$ computed
with respect to the invariant measure $\pi$, while the second measures instead the rate of decay of
the relative entropy of $e^{t\mathcal{L}} f$ w.r.t. $\pi$ (see, e.g., [36]). Advances in statistical physics over the past
decade have led to remarkable connections between these two quantities and the occurrence of a
phase transition (see, e.g., [40, 30, 29, 9, 28, 26]). As an example, on finite $n$-vertex squares with
free boundary in the 2-dimensional lattice $\mathbb{Z}^2$, when $h = 0$ and $\beta$ is smaller than the critical value
$\beta_c$, the spectral gap and the logarithmic Sobolev constant are $\Omega(1)$ (i.e., bounded away from zero
uniformly in $n$), while for $\beta > \beta_c$ they are both exponentially small in $\sqrt{n}$.

One of the most interesting and difficult questions left open by the above and related results
is the influence of boundary conditions on the spectral gap and the log-Sobolev constant when
$h = 0$ and $\beta > \beta_c$. It has been conjectured that, in the presence of an all-(+) boundary, the
relaxation process is driven by the mean-curvature motion of interfaces separating droplets of the
(−)-phase inside the (+)-phase, and therefore the mixing time should be polynomial in $n$ (most
likely $n^{2/d} \log n$) [14]. In particular it has been argued that the spectral gap for the pure phases
in high enough dimension should be $\Omega(1)$. Establishing results of this kind has proved very elusive,
and the only (presumably sharp) available bounds are upper bounds on the spectral gap and the
logarithmic Sobolev constant [7].

† Unfortunately the definition of the logarithmic Sobolev constant is not consistent across the literature. The ambiguity
arises because there are two definitions, one the inverse of the other. The definition used in this paper is the one that
puts the logarithmic Sobolev constant and the spectral gap on the same footing.

In this paper we prove a strong version of the above conjecture in what is known in statistical
physics as the Bethe approximation, namely when the lattice $\mathbb{Z}^d$ is replaced by a regular tree. Among
other results, we show that the spectral gap of the Glauber dynamics for the Ising model on a tree
with a (+)-boundary condition on its leaves is $\Omega(1)$ at all temperatures and all values of the external
field, and further that the same holds for the logarithmic Sobolev constant. Notice that, with a free
boundary, $\beta$ large and $h = 0$, both quantities tend to zero as $1/n^{\alpha}$, and the exponent $\alpha$ grows
arbitrarily large as $\beta \to \infty$ [3].

Ours is apparently the first result that quantifies the effect of boundary conditions on Glauber 
dynamics in an interesting scenario. We stress that, while the tree is simpler in many respects 
than $\mathbb{Z}^d$ due to the lack of cycles, in other respects it is more complex due to the large boundary:
e.g., it exhibits a "double phase transition," and the critical field at low temperature is non-zero
(see below). In the next subsection, we briefly describe the Ising model on trees before stating our
results in more detail. 

1.1 The Ising model on trees 

Fix $b \ge 2$ and let $\mathbb{T}^b$ denote the infinite $b$-ary tree. The Ising model on $\mathbb{T}^b$ is known [15, 24] to have
a phase diagram in the $(h, \beta)$ plane quite different from that on the cubic lattice $\mathbb{Z}^d$ (see Fig. 1),
and has recently received a lot of attention as the canonical example of a statistical physics model
on a "non-amenable" graph (i.e., one whose boundary is of comparable size to its volume) — see, 

e.g., [S1[T§1II31E2CE1E1I3. 

Figure 1: The critical field $h_c(\beta)$ (axis label recovered from the plot: $T = 1/\beta$). The Gibbs measure is unique above the curve.

Let us first discuss the behavior on the line $h = 0$. There is a first critical value $\beta_0 = \frac{1}{2}\log\big(\frac{b+1}{b-1}\big)$,
marking the dividing line between uniqueness and non-uniqueness of the Gibbs measure. Then,
in sharp contrast to the model on $\mathbb{Z}^d$, there is a second critical point $\beta_1 = \frac{1}{2}\log\big(\frac{\sqrt{b}+1}{\sqrt{b}-1}\big)$, which is
often referred to as the "spin-glass critical point" [10]. This second critical point is such that, in
the "intermediate temperature" region $\beta_0 < \beta < \beta_1$, the (+)- and (−)-boundary conditions exert
arbitrarily long-range influence on the spin at the root of the tree and hence give rise to different Gibbs
measures, but "typical" boundary conditions (i.e., chosen from the infinite volume Gibbs measure
with free boundary) do not. Another way to phrase this peculiar behavior is that the Gibbs measure
constructed via a free boundary is extremal for all $\beta < \beta_1$ (see [5, 19, 20, 3] and also [13, 33, 34]
for an analysis in the context of "bit reconstruction problems" for noisy data transmission).
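
For orientation, the two thresholds are easy to evaluate numerically. The sketch below assumes the closed forms β_0 = ½ log((b+1)/(b−1)) and β_1 = ½ log((√b+1)/(√b−1)), which are equivalent to tanh(β_0) = 1/b and tanh(β_1) = 1/√b; the function names `beta_0` and `beta_1` are introduced here purely for illustration.

```python
import math

def beta_0(b):
    """Uniqueness threshold on the b-ary tree: tanh(beta_0) = 1/b."""
    return 0.5 * math.log((b + 1) / (b - 1))

def beta_1(b):
    """Spin-glass threshold on the b-ary tree: tanh(beta_1) = 1/sqrt(b)."""
    r = math.sqrt(b)
    return 0.5 * math.log((r + 1) / (r - 1))

for b in (2, 3, 5):
    # The intermediate-temperature region is the interval (beta_0, beta_1).
    print(b, round(beta_0(b), 4), round(beta_1(b), 4))
```

Since 1/b < 1/√b for every b ≥ 2, the intermediate region β_0 < β < β_1 is never empty.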

Let us now examine what happens when an external field $h$ is added to the system. It turns out
that for all $\beta > \beta_0$, there is a critical value $h_c = h_c(\beta) > 0$ of the field such that the Gibbs measure
is not unique when $|h| \le h_c$, and is unique when $|h| > h_c$. (When $\beta \le \beta_0$ the Gibbs measure is
unique for all $h$, and $h_c$ is defined to be zero.) In the presence of a (+)-boundary, the Ising model
on the tree with external field $h = -h_c$ is rather analogous to the classical case of $\mathbb{Z}^d$ with zero
field. Both models share the following two properties: firstly, the Gibbs measure is sensitive to the
choice of boundary condition, and secondly, adding an arbitrarily small negative field causes the
Gibbs measure to become insensitive to the boundary condition (i.e., unique in the thermodynamic
limit).
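
Sensitivity to the boundary condition can be observed numerically through the standard leaf-to-root recursion for the Ising model on a tree. The sketch below is an illustration under stated assumptions, not a computation from the paper: it assumes Gibbs weights of the form exp[β(Σ σ_x σ_y + h Σ σ_x)], and the function name `root_magnetization` and its parameters are hypothetical. It returns the expected spin at the root of the complete b-ary tree when every boundary vertex carries the same spin.

```python
import math

def root_magnetization(b, beta, h, depth, boundary_spin):
    """Expected root spin on the complete b-ary tree of the given depth.

    boundary_spin (+1 or -1) is the value fixed on the children of the leaves.
    Keep beta moderate: tanh(beta) must stay below 1.0 in floating point.
    """
    def message(field):
        # Effective field that a child with cavity field `field` exerts on its parent.
        return math.atanh(math.tanh(beta) * math.tanh(field))

    # A leaf of T feels the external field and its b boundary children.
    field = beta * h + b * beta * boundary_spin
    for _ in range(depth):  # move up one level at a time until the root
        field = beta * h + b * message(field)
    return math.tanh(field)

b = 2
threshold = 0.5 * math.log((b + 1) / (b - 1))  # beta_0 for the binary tree
for beta in (0.5 * threshold, 2.0 * threshold):
    plus = root_magnetization(b, beta, 0.0, 60, +1)
    minus = root_magnetization(b, beta, 0.0, 60, -1)
    print(f"beta={beta:.3f}  plus={plus:+.4f}  minus={minus:+.4f}")
```

At zero field the two boundary conditions give the same root magnetization in the limit of large depth only when the map field ↦ b·message(field) has 0 as its unique fixed point, which happens exactly when b·tanh(β) ≤ 1. A sufficiently strong external field removes the dependence on the boundary again.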

Finally we remark that the concentration properties of the Gibbs measure for $\beta > \beta_0$, $h \ge -h_c$
and (+)-boundary are very different from those on $\mathbb{Z}^d$. In the latter case, along the line of first-order
phase transition, the (negative) large deviations for the bulk magnetization are related to the
appearance of a Wulff droplet of the opposite phase and are depressed by a negative exponential
in the surface of the droplet (see, e.g., [11]). Here instead, for any value of $h$, they are always
depressed by a negative exponential in the volume of the excess negative spins (the phenomenon
of "rigidity of the critical phases" [5]).

The Glauber dynamics for the Ising model on trees has also been studied. In a recent paper [3],
it is shown that the associated spectral gap (see (7) for a precise definition) with zero external
field and free boundary on a complete $b$-ary tree $T$ with $n$ vertices is $\Omega(1)$ at high and intermediate
temperatures (i.e., when $\beta < \beta_1$)*. Moreover, at the critical point $\beta = \beta_1$ the same spectral gap is
bounded above by $c/\log n$, and as soon as $\beta > \beta_1$ it becomes smaller than $c/n^{\alpha(\beta)}$, with $\alpha(\beta) \uparrow \infty$
as $\beta \to \infty$. Thus the critical point $\beta = \beta_1$ is reflected in the dynamics by an abrupt jump in the
behavior of the spectral gap as a function of the size of the tree $T$. Finally, also in [3], it is proved
that the spectral gap for arbitrary fixed $\beta$, $h$ and boundary condition can never shrink to zero faster
than an inverse polynomial in $n$. Again such a result should be compared to the lattice case, where
it is known that the spectral gap for a cube with $n$ sites can be exponentially small in the surface
$n^{(d-1)/d}$.

1.2 Main results and techniques 

Our first main result is a detailed analysis of the spectral gap of the Glauber dynamics in different 
regions of the phase diagram. The main novelty here is that we are able for the first time to prove 
a sharp result in the region where the spectral gap is highly sensitive to the boundary condition. 

Theorem 1.1 In both of the following situations, the spectral gap of the Glauber dynamics on a complete
$b$-ary tree $T$ with $n$ vertices is $\Omega(1)$:

(i) the boundary condition is arbitrary, and either $\beta < \beta_1$ (with $h$ arbitrary), or $|h| > h_c(\beta)$ (with
$\beta$ arbitrary);

(ii) the boundary condition is (+) and $\beta$, $h$ are arbitrary.

Remark: On $\mathbb{Z}^d$ not much is known about the spectral gap when $\beta > \beta_c$, $h = 0$ and the boundary condition
is (+), the notable exception being that of $\mathbb{Z}^2$ where it has been recently proved [7] that the spectral gap in
a square with $n$ sites shrinks to zero at least as fast as $1/\sqrt{n}$ (neglecting logarithmic corrections). The best known
lower bounds are significantly weaker [28]. In high enough dimensions ($d \ge 3$) it has been conjectured (see
[14] and [7]) that the spectral gap should stay bounded away from zero uniformly in $n$. The above theorem
can be looked upon as evidence in favor of this conjecture.

In our second main result we extend our analysis to the more delicate and difficult logarithmic
Sobolev constant (see (7) for a precise definition).

* Actually the arguments in [3] prove that the gap is $\Omega(1)$ for any $\beta < \beta_1$, arbitrary boundary condition and any
external field. Their argument, together with some monotonicity properties specific to the Ising model [35], implies a
mixing time of $O(\log n)$. Thus, although for $\beta_0 < \beta < \beta_1$ there exist several Gibbs measures, the mixing time of the
Glauber dynamics is insensitive to the boundary condition.
Theorem 1.2 In the same situations as in Theorem 1.1, the logarithmic Sobolev constant of the
Glauber dynamics on a complete $b$-ary tree $T$ with $n$ vertices is $\Omega(1)$.

As a corollary we obtain that, in the situations of Theorems 1.1 and 1.2, the Glauber dynamics
mixes (in a very strong sense) in time $O(\log n)$.

Remarks: 

(i) In $\mathbb{Z}^d$ with (+)-boundary condition, $\beta$ large and zero external field, the logarithmic Sobolev
constant in a cube with $n$ sites is always smaller than $n^{-2/d}$, neglecting logarithmic corrections [7],
in agreement with heuristic predictions based on mean-curvature motion of phase
interfaces.

(ii) We also prove (see Theorem 5.7) an additional result which shows that, for an arbitrary
nearest-neighbor spin system on a tree, as soon as the spectral gap is $\Omega(1)$ then the logarithmic
Sobolev constant cannot shrink faster than $(c \log n)^{-1}$. This means that, even when a constant
lower bound is known for the gap but not for log-Sobolev, one can deduce a mixing time of
$O((\log n)^2)$. While we do not require this fact to derive the results of this paper, we believe it
may be of interest for other models on trees.

In order to better appreciate Theorem 1.2, one should keep in mind that for general finite range,
translation invariant, compact spin models on $\mathbb{Z}^d$, if there exists an infinite volume Gibbs measure
$\mu$ with a positive logarithmic Sobolev constant, then the system is necessarily in the uniqueness
region and has exponentially decaying correlations [41]§. We also recall (see, e.g., [25]) that
when the log-Sobolev constant is bounded away from zero one can derive very strong (Gaussian-like)
concentration properties of the corresponding Gibbs measure, such as those proved in [5].

We now proceed to sketch some of our techniques and point out the main technical innovations. 

Our analysis of both the log-Sobolev constant and the spectral gap rests on certain spatial mixing
conditions that can be stated as follows. Let $f$ be a function of the spin configuration that does
not depend on the spins in the first $\ell$ levels of the tree starting from the root $r$, and let $\mu(f \mid \sigma_r)$
be the projection of $f$ onto the spin $\sigma_r$ at the root. If the variance (respectively, the entropy)
under the Gibbs measure $\mu$ of $\mu(f \mid \sigma_r)$ decays fast enough with the depth $\ell$, then we show by a
unified argument how to deduce a bound of $\Omega(1)$ on the spectral gap (respectively, the log-Sobolev
constant). Crucially, in contrast to previous approaches we do not require the above decay to hold
in arbitrary environments, but only for the Gibbs measure $\mu$ under consideration. This opens up
the possibility that the condition holds for some boundary conditions and not for others (with the
same values of temperature and external field). We also prove the converse, thus showing
that our mixing conditions are in fact equivalent to the required bounds on the spectral gap and
log-Sobolev constants.

This analysis has several advantages over previous ones [3, 35]: it is more direct, applies also
when there is an external field, and applies to general nearest-neighbor spin systems on trees.

The second main ingredient of the paper is establishing the above spatial mixing conditions in 
the scenarios of interest described in the above two theorems. This is done via a rather simple 
and novel coupling technique for the case of the variance. Such a technique provides, along the 
way, a new and really elementary proof of the extremality of the Gibbs measure with free boundary 
below $\beta_1$.

Surprisingly, we are also able to exploit the same coupling technique (via strong concentration 
properties of the Gibbs measure) to establish the entropy mixing condition. Thus in terms of the 
coupling analysis our conditions for variance and entropy mixing are essentially the same. 

§ A close look at the proof in [41] reveals that the same is true for any infinite, locally finite, bounded degree graph
such that the volume of any ball of radius $\ell$ grows sub-exponentially in $\ell$.
Finally, we mention that our results actually hold (with suitable modifications) for a much wider
class of spin systems on trees than just the Ising model, including the Potts model and models with
hard constraints such as the zero-temperature antiferromagnetic Potts model (proper colorings)
and the hard-core lattice gas model (independent sets). We briefly outline some of these extensions
at the end of the paper; full details can be found in a companion paper [31].

The remainder of the paper is organized as follows. In Section 2 we give some basic definitions
and notation. Then in Section 3 we define the spatial mixing conditions and relate them to the
spectral gap and log-Sobolev constant. The mixing conditions in the scenarios of interest for the
spectral gap and the log-Sobolev constant are verified in Sections 4 and 5 respectively. Finally, in
Section 6 we mention some extensions of our results to other models of interest. The proofs of
some technical lemmas omitted from the main text are collected in a supplement, Section 7.
2 Preliminaries

2.1 Gibbs distributions on trees 

For $b \ge 2$, let $\mathbb{T}^b$ denote the infinite, rooted $b$-ary tree (in which every vertex has $b$ children).
We will be concerned with (complete) finite subtrees $T$ of $\mathbb{T}^b$; if $T$ has depth $m$ then it has
$n = (b^{m+1} - 1)/(b - 1)$ vertices, and its boundary $\partial T$ consists of the children (in $\mathbb{T}^b$) of its leaves, i.e.,
$|\partial T| = b^{m+1}$. We identify subgraphs of $T$ with their vertex sets, and write $E(A)$ for the edges within
a subset $A$, and $\partial A$ for the boundary of $A$ (i.e., the neighbors of $A$ in $(T \cup \partial T) \setminus A$).
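
The two counting formulas can be checked directly by summing over levels. The helper below is a small illustrative sketch; the name `complete_tree_sizes` is introduced here and is not notation from the paper.

```python
def complete_tree_sizes(b, m):
    """Count the vertices of the complete b-ary tree of depth m and of its boundary."""
    levels = [b ** k for k in range(m + 1)]  # level k holds b^k vertices
    n = sum(levels)                          # number of vertices of T
    boundary = b * levels[-1]                # children of the leaves of T
    return n, boundary

n, boundary = complete_tree_sizes(3, 4)
```

Note that the boundary always satisfies |∂T| = (b − 1)n + 1, so it is larger than the tree itself. This is the "large boundary" feature that distinguishes trees from cubes in the lattice.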

Fix an Ising spin configuration $\tau$ on the infinite tree $\mathbb{T}^b$. We denote by $\Omega_T^\tau$ the set of (finite) spin
configurations $\sigma \in \{\pm 1\}^{T \cup \partial T}$ that agree with $\tau$ on $\partial T$; thus $\tau$ specifies a boundary condition on $T$.
Usually we abbreviate $\Omega_T^\tau$ to $\Omega$. For any $\eta \in \Omega$ and any subset $A \subseteq T$, we denote by $\mu_A^\eta$ the Gibbs
distribution over $\Omega$ conditioned on the configuration outside $A$ being $\eta$: i.e., if $\sigma \in \Omega$ agrees with $\eta$
outside $A$ then

$$\mu_A^\eta(\sigma) \;\propto\; \exp\Big[\beta\Big(\sum_{xy \in E(A \cup \partial A)} \sigma_x\sigma_y + h \sum_{x \in A} \sigma_x\Big)\Big],$$

where $\beta$ is the inverse temperature and $h$ the external field. We define $\mu_A^\eta(\sigma) = 0$ otherwise. In
particular, when $A = T$, $\mu_T^\tau$ is simply the Gibbs distribution on the whole of $T$ with boundary
condition $\tau$; we abbreviate $\mu_T^\tau$ to $\mu$.

For a function $f : \Omega \to \mathbb{R}$ we denote by $\mu_A^\eta(f) = \sum_{\sigma \in \Omega} \mu_A^\eta(\sigma) f(\sigma)$ the expectation of $f$ w.r.t. the
distribution $\mu_A^\eta$. It will be convenient to view $\mu_A^\eta(f)$ as a function of $\eta$, defined by $\mu_A(f)(\eta) = \mu_A^\eta(f)$,
the conditional expectation of $f$. Note that $\mu_A(f)$ is a function from $\Omega$ to $\mathbb{R}$ but depends only on
the configuration outside $A$. We write $\mathrm{Var}_A^\eta(f) = \mu_A^\eta(f^2) - \mu_A^\eta(f)^2$ and (for $f \ge 0$)
$\mathrm{Ent}_A^\eta(f) = \mu_A^\eta(f \log f) - \mu_A^\eta(f) \log \mu_A^\eta(f)$ for the variance and entropy of $f$ respectively w.r.t. $\mu_A^\eta$. Note that
$\mathrm{Var}_A^\eta(f) = 0$ iff, conditioned on the configuration outside $A$ being $\eta$, $f$ does not depend on the
configuration inside $A$. The same holds for $\mathrm{Ent}_A^\eta(f)$. In case $A = T$ we use the abbreviations
$\mu(f)$, $\mathrm{Var}(f)$ and $\mathrm{Ent}(f)$.

We record here some basic properties of variance and entropy that we use throughout the paper:

(i) For $B \subseteq A \subseteq T$,

$$\mathrm{Var}_A^\eta(f) = \mu_A^\eta[\mathrm{Var}_B(f)] + \mathrm{Var}_A^\eta[\mu_B(f)]. \qquad (2)$$

This equation expresses a decomposition of the variance into the local conditional variance in $B$
and the variance of the projection outside $B$.

(ii) If $A = \bigcup_i A_i$ for disjoint $A_i$, and the Gibbs distribution $\mu_A^\eta$ is the product of its marginals over
the $A_i$, then for any function $f$,

$$\mathrm{Var}_A^\eta(f) \le \sum_i \mu_A^\eta[\mathrm{Var}_{A_i}(f)]. \qquad (3)$$

(iii) For any two subsets $A, B \subseteq T$ such that $(\partial A) \cap B = \emptyset$, and for any function $f$,

$$\mu[\mathrm{Var}_A(\mu_B(f))] \le \mu[\mathrm{Var}_A(\mu_{A \cap B}(f))]. \qquad (4)$$

Properties (ii) and (iii) are consequences of the fact that variance w.r.t. a fixed measure is a convex
functional.

All three properties (i), (ii) and (iii) also hold with Var replaced by Ent.
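
Property (i), the decomposition of the variance into a local conditional variance plus the variance of the projection, can be verified by exhaustive enumeration on a toy instance. The following sketch is illustrative only: the binary tree of depth 1 with (+) boundary, the parameter values, the test function `f` and all helper names are assumptions made here, and the conditional distributions are computed from Gibbs weights of the form exp[β(Σ σ_x σ_y + h Σ σ_x)].

```python
import itertools
import math

# Binary tree of depth 1: root 0 with children 1, 2; boundary vertices 3..6 fixed to +1.
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]
FREE = [0, 1, 2]
BOUNDARY = {3: +1, 4: +1, 5: +1, 6: +1}
BETA, H = 0.7, -0.2

def weight(sigma):
    interaction = sum(sigma[x] * sigma[y] for x, y in EDGES)
    return math.exp(BETA * (interaction + H * sum(sigma[x] for x in FREE)))

def conditional(block, eta):
    """Gibbs distribution on `block`, given that all other spins agree with eta."""
    configs = []
    for spins in itertools.product((-1, +1), repeat=len(block)):
        sigma = dict(eta)
        sigma.update(zip(block, spins))
        configs.append(sigma)
    z = sum(weight(s) for s in configs)
    return [(s, weight(s) / z) for s in configs]

def mean(block, eta, f):
    return sum(p * f(s) for s, p in conditional(block, eta))

def var(block, eta, f):
    return mean(block, eta, lambda s: f(s) ** 2) - mean(block, eta, f) ** 2

def f(sigma):
    return sigma[0] + 2 * sigma[1] * sigma[2]

eta = {**BOUNDARY, 0: +1, 1: +1, 2: +1}  # only the boundary part matters when A = T
A, B = FREE, [1]
lhs = var(A, eta, f)
rhs = mean(A, eta, lambda s: var(B, s, f)) + var(A, eta, lambda s: mean(B, s, f))
```

Here `lhs` and `rhs` are the two sides of (2) with A = T and B a single child of the root; they agree up to floating-point rounding.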



2.2 The Glauber dynamics 

The Glauber dynamics on $T$ with boundary condition $\tau$ is the continuous-time Markov chain on
$\Omega = \Omega_T^\tau$ with Markov generator $\mathcal{L} = \mathcal{L}_T^\tau$ given by

$$(\mathcal{L}f)(\sigma) = \sum_{x \in T} c_x(\sigma)\,[f(\sigma^x) - f(\sigma)], \qquad (5)$$

where $\sigma^x$ denotes the configuration obtained from $\sigma$ by flipping the spin at the site $x$, and $c_x(\sigma)$
denotes the flip rate at $x$. Although all our results apply to any choice of finite-range, uniformly
positive and bounded flip rates satisfying the detailed balance condition w.r.t. the Gibbs measure,
for simplicity in the sequel we will work with a specific choice known as the heat-bath dynamics:

$$c_x(\sigma) = \mu_{\{x\}}^\sigma(\sigma^x) = \frac{1}{1 + w_x(\sigma)}, \quad \text{where } w_x(\sigma) = \exp\Big[2\beta\sigma_x\Big(\sum_{y:\, xy \in E} \sigma_y + h\Big)\Big].$$
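
The heat-bath rate and its detailed balance property can be checked at a single site. The sketch below assumes the rate c_x(σ) = 1/(1 + w_x(σ)) with w_x(σ) = exp[2β σ_x (Σ_y σ_y + h)], the sum running over the neighbors y of x, together with Gibbs weights exp[β(Σ σ_x σ_y + h Σ σ_x)]; the helper names and the numerical values are hypothetical.

```python
import math

def flip_rate(sigma, x, neighbors, beta, h):
    """Heat-bath flip rate c_x(sigma) = 1 / (1 + w_x(sigma))."""
    local_field = sum(sigma[y] for y in neighbors[x]) + h
    w = math.exp(2.0 * beta * sigma[x] * local_field)
    return 1.0 / (1.0 + w)

def gibbs_weight(sigma, edges, free_sites, beta, h):
    """Unnormalized Gibbs weight exp(beta * (sum s_x s_y + h * sum s_x))."""
    interaction = sum(sigma[x] * sigma[y] for x, y in edges)
    field = h * sum(sigma[x] for x in free_sites)
    return math.exp(beta * (interaction + field))

# Site 0 with three neighbors (a parent and two children, as in a binary tree).
edges = [(0, 1), (0, 2), (0, 3)]
neighbors = {0: [1, 2, 3]}
beta, h = 0.8, 0.3
sigma = {0: +1, 1: +1, 2: -1, 3: +1}
flipped = dict(sigma)
flipped[0] = -sigma[0]

# Detailed balance: weight(sigma) * c_x(sigma) == weight(sigma^x) * c_x(sigma^x).
forward = gibbs_weight(sigma, edges, [0], beta, h) * flip_rate(sigma, 0, neighbors, beta, h)
backward = gibbs_weight(flipped, edges, [0], beta, h) * flip_rate(flipped, 0, neighbors, beta, h)
```

The two flip rates at a site always sum to one, because w_x(σ^x) = 1/w_x(σ); this reflects the fact that the heat-bath rule resamples the spin from its conditional distribution.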

It is a well-known fact (and easily checked) that the Glauber dynamics is ergodic and reversible
w.r.t. the Gibbs distribution $\mu = \mu_T^\tau$, and so converges to the stationary distribution $\mu$. The rate of
convergence is often measured using two concepts from functional analysis: the spectral gap and
the logarithmic Sobolev constant. For a function $f : \Omega \to \mathbb{R}$, define the Dirichlet form of $f$ associated
with the generator $\mathcal{L}$ by

$$\mathcal{D}(f) := \frac{1}{2}\sum_x \mu\big(c_x\,[f(\sigma^x) - f(\sigma)]^2\big) = \sum_x \mu\big(\mathrm{Var}_{\{x\}}(f)\big). \qquad (6)$$

(The l.h.s. here is the general definition for any choice of the flip rates $c_x$; the last equality holds
when specializing to the case of the heat-bath dynamics.) The spectral gap $c_{\mathrm{gap}}(\mu)$ and the logarithmic
Sobolev constant $c_{\mathrm{sob}}(\mu)$ of the chain are then defined by



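$$c_{\mathrm{gap}}(\mu) = \inf_f \frac{\mathcal{D}(f)}{\mathrm{Var}(f)}; \qquad c_{\mathrm{sob}}(\mu) = \inf_{f \ge 0} \frac{\mathcal{D}(\sqrt{f})}{\mathrm{Ent}(f)}, \qquad (7)$$

where the infimum in each case is over non-constant functions $f$.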

As is well known, these two quantities measure the rate of exponential decay as $t \to \infty$ of the
variance and relative entropy respectively (see, e.g., [36]). The quantity $c_{\mathrm{gap}}$ also has a natural
interpretation as the smallest positive eigenvalue of $-\mathcal{L}$.
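
The two expressions for the Dirichlet form under heat-bath rates, and the variational characterization of the gap, can be explored by enumeration on a toy tree. This is a sketch under explicit assumptions: the binary tree of depth 1 with (+) boundary, the parameter values, the test function `f` and every helper name below are illustrative choices made here. Any fixed non-constant f only yields an upper bound on c_gap, since the gap is an infimum over all such functions.

```python
import itertools
import math

# Binary tree of depth 1 (root 0, children 1 and 2) with (+) boundary vertices 3..6.
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]
FREE = [0, 1, 2]
BOUNDARY = {3: +1, 4: +1, 5: +1, 6: +1}
BETA, H = 0.6, 0.1
NEIGHBORS = {x: [b if a == x else a for a, b in EDGES if x in (a, b)] for x in FREE}

def configurations():
    for spins in itertools.product((-1, +1), repeat=len(FREE)):
        sigma = dict(BOUNDARY)
        sigma.update(zip(FREE, spins))
        yield sigma

def weight(sigma):
    interaction = sum(sigma[x] * sigma[y] for x, y in EDGES)
    return math.exp(BETA * (interaction + H * sum(sigma[x] for x in FREE)))

Z = sum(weight(s) for s in configurations())

def mu(g):
    """Expectation of g under the Gibbs distribution with (+) boundary."""
    return sum(weight(s) / Z * g(s) for s in configurations())

def flip(sigma, x):
    flipped = dict(sigma)
    flipped[x] = -flipped[x]
    return flipped

def rate(sigma, x):
    w = math.exp(2.0 * BETA * sigma[x] * (sum(sigma[y] for y in NEIGHBORS[x]) + H))
    return 1.0 / (1.0 + w)

def dirichlet_form(f):
    return 0.5 * sum(
        mu(lambda s, x=x: rate(s, x) * (f(flip(s, x)) - f(s)) ** 2) for x in FREE
    )

def sum_of_local_variances(f):
    def local_var(s, x):
        p = rate(s, x)  # conditional probability of the flipped value at x
        return p * (1.0 - p) * (f(flip(s, x)) - f(s)) ** 2  # variance of a two-point law
    return sum(mu(lambda s, x=x: local_var(s, x)) for x in FREE)

def variance(f):
    return mu(lambda s: f(s) ** 2) - mu(f) ** 2

def f(sigma):
    return sigma[0] + 0.5 * sigma[1] * sigma[2]

ratio = dirichlet_form(f) / variance(f)  # an upper bound on c_gap for this tiny tree
```

The functions `dirichlet_form` and `sum_of_local_variances` compute the two sides of (6) and agree up to rounding.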

We make the following important note. When discussing the asymptotics of $c_{\mathrm{sob}}$ (or $c_{\mathrm{gap}}$) for a
fixed boundary condition $\tau$, we think of the infinite sequence of Gibbs distributions $\{\mu_T^\tau\}$, where $T$
ranges over all finite complete subtrees of $\mathbb{T}^b$. In particular, when we say that $c_{\mathrm{sob}}(\mu) = c_{\mathrm{sob}}(\mu_T^\tau) = \Omega(1)$
we mean that there exists a finite constant $C > 0$ such that for every $T$ (or equivalently for
every $\mu \in \{\mu_T^\tau\}$), $c_{\mathrm{sob}}(\mu) > 1/C$.

We close this section by recalling some well-known relationships between the above constants
and certain notions of mixing time of the Glauber dynamics. Define $h_t^\sigma(\eta) = \frac{P_t(\sigma, \eta)}{\mu(\eta)}$, where
$P_t(\sigma, \eta) := e^{t\mathcal{L}}(\sigma, \eta)$ is the transition kernel at time $t$. Then, for $1 \le p \le \infty$, define

$$T_p := \min\Big\{t > 0 : \sup_\sigma \|h_t^\sigma - 1\|_p \le \frac{1}{e}\Big\}, \qquad (8)$$

where $\|f\|_p$ denotes the $L^p(\Omega, \mu)$ norm of $f$. The time $T_1$ is usually called simply the mixing time of
the chain. Standard results relating $T_p$ to the spectral gap and log-Sobolev constant (see, e.g., [36]),
when specialized to the Glauber dynamics, yield the following:

Theorem 2.1 On an $n$-vertex $b$-ary tree $T$ with boundary condition $\tau$,

(i) $c_{\mathrm{gap}}(\mu)^{-1} \le T_1 \le c_{\mathrm{gap}}(\mu)^{-1} \times C_1 n$;

(ii) $c_{\mathrm{gap}}(\mu)^{-1} \le T_2 \le c_{\mathrm{sob}}(\mu)^{-1} \times C_2 \log n$,

where $\mu = \mu_T^\tau$ and $C_1$, $C_2$ are constants depending only on $b$, $\beta$ and $h$. □
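
To make explicit how Theorem 2.1 produces the mixing time bounds quoted in the Introduction, the following worked bounds may help. They are a sketch only: the symbol $c_0$ is a placeholder introduced here for an assumed uniform lower bound, and is not notation from the paper.

```latex
% Assumption: c_gap(mu) >= c_0 > 0 uniformly in n. Theorem 2.1(i) then gives
T_1 \;\le\; c_{\mathrm{gap}}(\mu)^{-1}\, C_1\, n \;\le\; \frac{C_1}{c_0}\, n \;=\; O(n).
% Assumption: c_sob(mu) >= c_0 > 0 uniformly in n. Theorem 2.1(ii) then gives
T_2 \;\le\; c_{\mathrm{sob}}(\mu)^{-1}\, C_2 \log n \;\le\; \frac{C_2}{c_0}\, \log n \;=\; O(\log n).
% Weaker assumption: c_sob(mu) >= (c log n)^{-1}, as in Remark (ii) of Section 1.2:
T_2 \;\le\; (c \log n)\, C_2 \log n \;=\; O\big((\log n)^2\big).
```

The first line shows why a constant spectral gap alone yields only a linear bound, and why the log-Sobolev constant is needed for the sharper $O(\log n)$ estimate.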

Finally, we note that our choice of the heat-bath dynamics is not essential. Since changing to 
any other reversible local update rule (e.g., the Metropolis rule) affects $c_{\mathrm{sob}}$ and $c_{\mathrm{gap}}$ by at most a
constant factor, our analysis applies to any choice of Glauber dynamics. 



3 Spatial mixing conditions for spectral gap and log-Sobolev 

In this section we define a certain spatial mixing condition (i.e., a form of weak dependence between 
the spin at a site and the configuration far from that site) for a Gibbs distribution $\mu$, and prove
that this condition implies that $c_{\mathrm{gap}}(\mu) = \Omega(1)$. An analogous condition implies that $c_{\mathrm{sob}}(\mu) = \Omega(1)$.
Our spatial mixing conditions have two main advantages over those used previously: first,
the conditions for the spectral gap and the log-Sobolev constant are identical in form, allowing a 
uniform treatment; second, and more importantly, they are measure-specific, i.e., they may hold 
for the Gibbs distribution induced by some specific boundary configuration while not holding for 
other boundary configurations. Hence, the conditions are sensitive enough to show rapid mixing 
for specific boundaries even though the mixing time with other boundaries is slow for the same 
choice of temperature and external field. We also note that the results of this section hold not just 
for the Ising model but for any nearest-neighbor interaction model on a tree. 



3.1 Reduction to block analysis 

Before presenting the main result of this section, we need some more definitions and background.
For each site $x \in T$, let $B_{x,\ell} \subseteq T$ denote the subtree (or "block") of height $\ell - 1$ rooted at $x$, i.e.,
$B_{x,\ell}$ consists of $\ell$ levels. (If $x$ is $k < \ell$ levels from the bottom of $T$ then $B_{x,\ell}$ has only $k$ levels.)
In what follows we will think of $\ell$ as a suitably large constant. By analogy with expression (6)
for the Dirichlet form, let $\mathcal{D}_\ell(f) = \sum_{x \in T} \mu[\mathrm{Var}_{B_{x,\ell}}(f)]$ denote the local variation of $f$ w.r.t. the
blocks $\{B_{x,\ell}\}$. A straightforward manipulation (see, e.g., [28], keeping in mind that each site
belongs to at most $\ell$ blocks) shows that $c_{\mathrm{gap}}$ can be bounded as follows:

$$c_{\mathrm{gap}}(\mu) \ge \frac{1}{\ell} \cdot \inf_f \frac{\mathcal{D}_\ell(f)}{\mathrm{Var}(f)} \cdot \min_{x,\eta} c_{\mathrm{gap}}(\mu_{B_{x,\ell}}^\eta). \qquad (9)$$






As before, the infimum is taken over non-constant functions (and henceforth we omit explicit mention
of this). The importance of (9) is that $\min_{x,\eta} c_{\mathrm{gap}}(\mu_{B_{x,\ell}}^\eta)$ depends only on the size of $B_{x,\ell}$ and $\beta$,
but not on the size of $T$; in fact, it is at least $\Omega(e^{-c(b,\beta)\ell})$ [3]. Therefore, in order to show that $c_{\mathrm{gap}}$
is bounded below by a constant independent of the size of $T$, it is enough to show that, for some finite $\ell$,
$\mathrm{Var}(f) \le \mathrm{const} \times \mathcal{D}_\ell(f)$ for all functions $f$. This is what we will show below, under the relevant
spatial mixing condition. As a side remark, notice that $\inf_f \frac{\mathcal{D}_\ell(f)}{\mathrm{Var}(f)}$ is exactly the spectral gap of the
Glauber dynamics based on flipping blocks $B_{x,\ell}$, rather than single sites $x$.
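
The combinatorial fact behind the factor 1/ℓ in (9), namely that each site belongs to at most ℓ of the blocks, can be confirmed by direct enumeration. The sketch below uses a hypothetical labelling of sites by (depth, index) and a helper name `block_membership_counts` introduced here for illustration.

```python
def block_membership_counts(b, m, ell):
    """For every site of the complete b-ary tree of depth m, count the blocks
    B_{x,ell} (with x ranging over all sites) that contain it.

    Sites are labelled (depth, index); the children of (d, i) are
    (d + 1, b * i + c) for c in range(b).
    """
    counts = {(d, i): 0 for d in range(m + 1) for i in range(b ** d)}
    for root in list(counts):
        level = [root]
        for _ in range(ell):  # a block has at most ell levels
            for site in level:
                counts[site] += 1
            # Descend one level; blocks near the bottom are truncated at depth m.
            level = [(d + 1, b * i + c) for d, i in level if d < m for c in range(b)]
    return counts

counts = block_membership_counts(b=2, m=4, ell=3)
```

A site at depth d lies in the blocks rooted at itself and at its nearest ancestors, so its count is min(d + 1, ℓ), which never exceeds ℓ.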

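An identical manipulation yields an analogous bound for the log-Sobolev constant. For a non-negative
function $f$, let $\mathcal{E}_\ell(f) = \sum_{x \in T} \mu[\mathrm{Ent}_{B_{x,\ell}}(f)]$. Then

$$c_{\mathrm{sob}}(\mu) \ge \frac{1}{\ell} \cdot \inf_{f \ge 0} \frac{\mathcal{E}_\ell(f)}{\mathrm{Ent}(f)} \cdot \min_{x,\eta} c_{\mathrm{sob}}(\mu_{B_{x,\ell}}^\eta). \qquad (10)$$

Hence to bound $c_{\mathrm{sob}}(\mu)$ it suffices to show that, for some constant $\ell$, $\mathrm{Ent}(f) \le \mathrm{const} \times \mathcal{E}_\ell(f)$ for
all $f \ge 0$.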



3.2 Spatial mixing 

We are now ready to state our spatial mixing conditions, first for the variance and then for the
entropy. For $x \in T$, write $T_x$ for the subtree rooted at $x$, and $\tilde{T}_x$ for $T_x \setminus \{x\}$, the subtree $T_x$
excluding its root.

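Definition 3.1 [Variance Mixing] We say that $\mu = \mu_T^\tau$ satisfies VM($\ell$, $\varepsilon$) if for every $x \in T$, any
$\eta \in \Omega_T^\tau$ and any function $f$ that does not depend on $B_{x,\ell}$, the following holds:

$$\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \varepsilon \cdot \mathrm{Var}_{T_x}^\eta(f).$$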

Let us briefly discuss the above condition. Essentially, $\varepsilon = \varepsilon(\ell)$ gives the rate of decay with
distance $\ell$ of point-to-set correlations. To see this, note that the l.h.s. $\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)]$ is the variance
of the projection of $f$ onto the root $x$ of $T_x$, which is at distance $\ell$ from the sites on which $f$ depends.
It is also worth noting that the required uniformity in $\eta$ in VM is not very restrictive: since the
distribution $\mu_{T_x}^\eta$ depends only on the restriction of $\eta$ to the boundary of $T_x$, and since $\eta \in \Omega_T^\tau$
(i.e., $\eta$ agrees with $\tau$ on $\partial T$ and therefore on the bottom boundary of $T_x$), the only freedom left
in choosing $\eta$ is in choosing the spin of the parent of $x$. Thus, VM is essentially a property of
the distribution induced by the boundary condition $\tau$. It is this lack of uniformity (i.e., the fact
that we need not verify VM for other boundary conditions) that makes it flexible enough for our
applications.
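
To get a feeling for the quantity controlled by VM, one can evaluate the ratio between the variance of the projection onto the root and the full variance for one specific test function on a toy tree. The sketch below is only an illustration under stated assumptions: it uses a binary tree of depth 2 rooted at x (so x has no parent), the test function f equal to the sum of the spins at depth 2, Gibbs weights of the form exp[β(Σ σ_x σ_y + h Σ σ_x)], and a hypothetical helper name `vm_ratio`. Because VM quantifies over all admissible f, the value computed for a single f is merely a lower bound on the smallest ε for which VM(2, ε) could hold on this tree.

```python
import itertools
import math

def vm_ratio(beta, h, boundary_spin):
    """Var[mu(f | sigma_x)] / Var(f) on a depth-2 binary tree rooted at x = 0.

    f is the sum of the spins at depth 2, so it ignores the block B_{x,2}
    (depths 0 and 1). boundary_spin = +1 or -1 fixes the children of the
    leaves; boundary_spin = 0 switches them off (free boundary).
    """
    edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]
    leaves = [3, 4, 5, 6]
    # Per root spin: total weight, weighted sum of f, weighted sum of f^2.
    stats = {+1: [0.0, 0.0, 0.0], -1: [0.0, 0.0, 0.0]}
    for spins in itertools.product((-1, +1), repeat=7):
        interaction = sum(spins[a] * spins[b] for a, b in edges)
        # Each leaf has two boundary children carrying boundary_spin.
        interaction += 2 * boundary_spin * sum(spins[v] for v in leaves)
        w = math.exp(beta * (interaction + h * sum(spins)))
        f = sum(spins[v] for v in leaves)
        acc = stats[spins[0]]
        acc[0] += w
        acc[1] += w * f
        acc[2] += w * f * f
    z = stats[+1][0] + stats[-1][0]
    mean = (stats[+1][1] + stats[-1][1]) / z
    second_moment = (stats[+1][2] + stats[-1][2]) / z
    total_var = second_moment - mean ** 2
    # Variance of the conditional expectation of f given the root spin.
    proj_var = sum(acc[0] / z * (acc[1] / acc[0] - mean) ** 2 for acc in stats.values())
    return proj_var / total_var

for boundary_spin in (+1, 0, -1):
    print(boundary_spin, vm_ratio(0.4, 0.0, boundary_spin))
```

By the variance decomposition (2) the ratio always lies between 0 and 1, and it vanishes at β = 0, where the spins are independent.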

As the following theorem states, if VM($\ell$, $\varepsilon$) holds with $\varepsilon \approx 1/(2\ell)$ then we get a lower bound
on $c_{\mathrm{gap}}$:

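Theorem 3.2 For any $\ell$ and $\delta > 0$, if $\mu$ satisfies VM($\ell$, $\frac{1-\delta}{2(\ell+1-\delta)}$) then $\mathrm{Var}(f) \le \frac{3}{\delta} \cdot \mathcal{D}_\ell(f)$
for all $f$. In particular, if VM with the above parameters holds for some fixed $\ell$ and $\delta > 0$, for all
$\mu = \mu_T^\tau$ with $T$ a full subtree, then $c_{\mathrm{gap}}(\mu) = \Omega(1)$. Conversely, if $c_{\mathrm{gap}}(\mu) = \Omega(1)$ then for all $T$, $\mu_T^\tau$
satisfies VM($\ell$, $c e^{-\vartheta\ell}$) for some constants $c, \vartheta > 0$ and all $\ell$.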

Remark: The second part of the theorem was already proved in [3], where it was shown that for general
nearest-neighbor spin systems on any bounded degree graph, if $c_{\mathrm{gap}}(\mu)$ is bounded away from zero independently of $n$
then $\mu$ exhibits an exponential decay of point-to-set correlations (i.e., VM($\ell$, $c \exp(-\vartheta\ell)$) holds for all $\ell$). The
authors of [3] posed the question of whether the converse is also true. Theorem 3.2 (which holds for general
nearest-neighbor spin systems on a tree) answers this question affirmatively when the graph is a tree. In fact,
as is apparent from the above theorem, the decay of point-to-set correlations on a tree is either slower than
linear or exponentially fast.
The analogous mixing condition for entropy and the log-Sobolev constant is the following:

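Definition 3.3 [Entropy Mixing] We say that $\mu = \mu_T^\tau$ satisfies EM($\ell$, $\varepsilon$) if for every $x \in T$, any
$\eta \in \Omega_T^\tau$ and any non-negative function $f$ that does not depend on $B_{x,\ell}$, the following holds:

$$\mathrm{Ent}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \varepsilon \cdot \mathrm{Ent}_{T_x}^\eta(f).$$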

Before stating the analog of Theorem 3.2 relating $c_{\mathrm{sob}}$ to EM, we need to define one more
constant. Let $p_{\min} = \min_{x,s,\eta} \mu_{\{x\}}^\eta(\sigma_x = s)$, where $s$ ranges over $\{+, -\}$; i.e., $p_{\min}$ is the minimum
probability of any spin value at any site with any boundary condition. Since every site has at most $b + 1$
neighbors, it is easy to see that $p_{\min} \ge \frac{1}{2} e^{-2\beta(b+1+|h|)}$, a constant depending only on $b$, $\beta$, $h$.

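Theorem 3.4 For any $\ell$ and $\delta > 0$, if $\mu$ satisfies EM($\ell$, $[(1-\delta)p_{\min}/(\ell+1-\delta)]^2$) then $\mathrm{Ent}(f) \le \frac{2}{\delta} \cdot \mathcal{E}_\ell(f)$
for all $f \ge 0$. In particular, if EM with the above parameters holds for some fixed $\ell$ and $\delta > 0$, for all
$\mu = \mu_T^\tau$ with $\tau$ fixed and $T$ an arbitrary full subtree, then $c_{\mathrm{sob}}(\mu) = \Omega(1)$. Conversely, if $c_{\mathrm{sob}}(\mu) = \Omega(1)$
then for all $T$, $\mu_T^\tau$ satisfies EM($\ell$, $c e^{-\vartheta\ell}$) for some constants $c, \vartheta > 0$ and all $\ell$.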

In order to prove Theorems 3.2 and 3.4 it is convenient to work with spatial mixing conditions that
are somewhat more involved than VM and EM. The main difference is that we want to allow for
functions that may depend on $B_{x,\ell}$ (the first $\ell$ levels of $T_x$) and thus need to introduce a term for
this dependency. The modified conditions express the property that the variance (entropy) of the
projection of any function $f$ onto the root $x$ of $T_x$ can be bounded up to a constant factor by the
local variance (entropy) of $f$ in $B_{x,\ell}$, plus a negligible factor times the local variance (entropy) of $f$
in $\tilde{T}_x$. As the following lemma states, the modified conditions (with appropriate parameters) can
be deduced from VM and EM.

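Lemma 3.5 (i) For any $\varepsilon < \frac{1}{2}$, if $\mu = \mu_T^\tau$ satisfies VM($\ell$, $\varepsilon$) then for every $x \in T$, any $\eta \in \Omega_T^\tau$
and any function $f$ we have

$$\mathrm{Var}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \frac{2-\varepsilon'}{1-\varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Var}_{B_{x,\ell}}(f)] + \frac{\varepsilon'}{1-\varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Var}_{\tilde{T}_x}(f)],$$

with $\varepsilon' = 2\varepsilon$.

(ii) For any $\varepsilon < p_{\min}^2$, if $\mu = \mu_T^\tau$ satisfies EM($\ell$, $\varepsilon$) then for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any
function $f \ge 0$ we have

$$\mathrm{Ent}_{T_x}^\eta[\mu_{\tilde{T}_x}(f)] \le \frac{1}{1-\varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Ent}_{B_{x,\ell}}(f)] + \frac{\varepsilon'}{1-\varepsilon'} \cdot \mu_{T_x}^\eta[\mathrm{Ent}_{\tilde{T}_x}(f)],$$

with $\varepsilon' = \frac{\sqrt{\varepsilon}}{p_{\min}}$.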

Remark: We note that with extra work, part (ii) of Lemma 3.5 can be improved to hold with $\varepsilon' = c(p_{\min})\varepsilon$. We give the weaker bound because it is simpler to prove while still enough for our applications.

Similar statements to those in Lemma 3.5 appeared in [4]. We defer our proof to Section 7. We can now prove Theorems 3.2 and 3.4 by working with the modified spatial mixing conditions of Lemma 3.5.

Proof of Theorems 3.2 and 3.4: Here we only prove the forward direction of both theorems. The reverse direction of Theorem 3.2 was proved in 0, as already mentioned above. The proof of the reverse direction of Theorem 3.4 is deferred to Section 7 because it uses machinery developed later in the paper.

The main step in the proof of the forward direction is to show the following claim:

Claim 3.6 If for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any function $f$,

\[ \mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)] \le c \cdot \mu^\eta_{T_x}[\mathrm{Var}_{B_{x,\ell}}(f)] + \left(\frac{1-\delta}{\ell}\right) \cdot \mu^\eta_{T_x}[\mathrm{Var}_{\tilde T_x}(f)], \]

then $\mathrm{Var}(f) \le \frac{c}{\delta} \cdot \mathcal{D}_\ell(f)$ for all $f$. The same implication holds when Var is replaced by Ent, $\mathcal{D}_\ell$ is replaced by $\mathcal{E}_\ell$ and the function $f$ is restricted to be non-negative.

Observe that the hypothesis of Theorem 3.2 together with part (i) of Lemma 3.5 establishes the hypothesis of Claim 3.6 with $c \le 3$, and similarly, the hypothesis of Theorem 3.4 together with part (ii) of Lemma 3.5 establishes the hypothesis of Claim 3.6 (after the necessary replacement of symbols) with $c \le 2$.

It therefore suffices to prove Claim 3.6. We prove only the formulation with Var and $\mathcal{D}_\ell$, since the proof for the formulation with Ent and $\mathcal{E}_\ell$ is identical once we make the same replacements in the text of the proof. As will be clear below, the proof uses only properties which are common to both Var and Ent.

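Consider an arbitrary function $f : \Omega \to \mathbb{R}$. Our first goal is to relate $\mathrm{Var}(f)$ to the projections $\mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)]$ for $x \in T$, so that we can apply the spatial mixing condition of the hypothesis. Recall that $T$ has $m+1$ levels, and define the increasing sequence $\emptyset = F_0 \subset F_1 \subset \ldots \subset F_{m+1} = T$, where $F_i$ consists of all sites in the lowest $i$ levels of $T$. Thus $F_i$ is a forest of height $i-1$. Using the decomposition of the variance into a local part and a projected part recursively, and the facts that $\mu_{F_{i+1}}(\mu_{F_i}(f)) = \mu_{F_{i+1}}(f)$ and $\mu_{F_0}(f) = f$, we obtain

\[
\begin{aligned}
\mathrm{Var}(f) &= \mu[\mathrm{Var}_{F_1}(f)] + \mathrm{Var}[\mu_{F_1}(f)] \\
&= \mu[\mathrm{Var}_{F_1}(f)] + \mu[\mathrm{Var}_{F_2}(\mu_{F_1}(f))] + \mathrm{Var}[\mu_{F_2}(\mu_{F_1}(f))] \\
&\;\;\vdots \\
&= \sum_{i=1}^{m+1} \mu[\mathrm{Var}_{F_i}(\mu_{F_{i-1}}(f))].
\end{aligned}
\]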

Now a fundamental property of nearest-neighbor interaction models on a tree is that, given the configuration on $T \setminus F_i$, the Gibbs distribution on $F_i$ becomes a product of the marginals on the subtrees rooted at the sites $x \in F_i \setminus F_{i-1}$. Using the inequality for the variance of a product measure, we therefore have that

\[ \mathrm{Var}(f) \le \sum_{i=1}^{m+1} \sum_{x \in F_i \setminus F_{i-1}} \mu[\mathrm{Var}_{T_x}(\mu_{F_{i-1}}(f))] \le \sum_{x \in T} \mu[\mathrm{Var}_{T_x}(\mu_{\tilde T_x}(f))], \tag{11} \]

where in the second inequality we used the convexity of the variance.
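The first step of the recursion above is the identity Var(f) = μ[Var_A(f)] + Var[μ_A(f)]. As a sanity check, the following sketch verifies it by brute-force enumeration on a three-site tree (a root with two children). The helper names (`gibbs`, `project`), the test function `f` and the parameter values are illustrative assumptions, not taken from the paper.

```python
import itertools
import math

def gibbs(beta, h, edges, n):
    """Ising Gibbs distribution on n sites, by brute-force enumeration."""
    configs = list(itertools.product([-1, 1], repeat=n))
    weights = [math.exp(beta * (sum(s[i] * s[j] for i, j in edges) + h * sum(s)))
               for s in configs]
    z = sum(weights)
    return {s: w / z for s, w in zip(configs, weights)}

def mean(mu, f):
    return sum(p * f[s] for s, p in mu.items())

def var(mu, f):
    m = mean(mu, f)
    return sum(p * (f[s] - m) ** 2 for s, p in mu.items())

def project(mu, f, A):
    """mu_A(f): average f over the spins in A, given the spins outside A."""
    def outside(s):
        return tuple(v for i, v in enumerate(s) if i not in A)
    num, den = {}, {}
    for s, p in mu.items():
        k = outside(s)
        num[k] = num.get(k, 0.0) + p * f[s]
        den[k] = den.get(k, 0.0) + p
    return {s: num[outside(s)] / den[outside(s)] for s in mu}

# Root 0 with two children 1 and 2; A = {1, 2} plays the role of F_1.
mu = gibbs(beta=0.7, h=0.2, edges=[(0, 1), (0, 2)], n=3)
f = {s: s[0] + 2 * s[1] * s[2] - s[1] for s in mu}   # arbitrary test function
A = {1, 2}
f_A = project(mu, f, A)
local = {s: (f[s] - f_A[s]) ** 2 for s in mu}   # mean(mu, local) equals mu[Var_A(f)]
total = var(mu, f)
split = mean(mu, local) + var(mu, f_A)
print(total, split)
```

The two printed numbers should agree up to floating-point rounding; iterating the same split over the forests F_1, F_2, ... gives the telescoping sum.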

Notice that so far we have not used the spatial mixing condition in the hypothesis of Claim 3.6 but only a natural martingale structure induced by the tree. Let us denote the final sum in (11) by $\mathrm{Pvar}(f)$. In order to bound $c_{gap}$, we need to compare the projection terms $\mathrm{Var}_{T_x}(\mu_{\tilde T_x}(f))$ in $\mathrm{Pvar}(f)$ with the local conditional variance terms in $\mathcal{D}_\ell(f)$. For example, notice that if $\mu$ were the product of its single-site marginals then $\mathrm{Var}_{T_x}(\mu_{\tilde T_x}(f)) \le \mu_{T_x}[\mathrm{Var}_x(f)]$ and $c_{gap} = 1$. However, in general the variance of the projection on $x$ may also involve terms which depend on other sites, and may lead to a factor that grows with the size of $T_x$. We will use the spatial mixing condition in order to preclude the latter possibility. Specifically, we show that if for every $x \in T$, any $\eta \in \Omega_T^\tau$ and any function $g$, $\mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(g)] \le c \cdot \mu^\eta_{T_x}[\mathrm{Var}_{B_{x,\ell}}(g)] + \varepsilon \cdot \mu^\eta_{T_x}[\mathrm{Var}_{\tilde T_x}(g)]$, then for every $x \in T$ and $\eta \in \Omega_T^\tau$,

\[ \mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)] \le c \cdot \mu^\eta_{T_x}[\mathrm{Var}_{B_x}(f)] + \varepsilon \cdot \sum_{y \in B_x \cup \partial B_x,\; y \ne x} \mu^\eta_{T_x}[\mathrm{Var}_{T_y}(\mu_{\tilde T_y}(f))], \tag{12} \]

where we have abbreviated $B_{x,\ell}$ to $B_x$, and $\partial B_x$ stands for the boundary of $B_x$ excluding the parent of $x$, i.e., the bottom boundary of $B_x$. Notice that the last term in (12) is relevant only when $x$ is at distance at least $\ell$ from the bottom of $T$. When $x$ belongs to one of the $\ell$ lowest levels of $T$ then $T_x = B_x$, and thus trivially $\mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)] \le \mu^\eta_{T_x}[\mathrm{Var}_{B_x}(f)]$.

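Let us assume (12) for now and conclude the proof of the theorem. Applying (12) for every $x$ and $\eta$, and using the hypothesis that $\varepsilon = \frac{1-\delta}{\ell}$ and the fact that each site appears in at most $\ell$ blocks, we get

\[
\begin{aligned}
\mathrm{Pvar}(f) &\le c \cdot \mathcal{D}_\ell(f) + \varepsilon \cdot \sum_{x \in T} \sum_{y \in B_x \cup \partial B_x,\; y \ne x} \mu[\mathrm{Var}_{T_y}(\mu_{\tilde T_y}(f))] \\
&\le c \cdot \mathcal{D}_\ell(f) + (1-\delta)\,\mathrm{Pvar}(f),
\end{aligned}
\]

and hence

\[ \mathrm{Var}(f) \le \mathrm{Pvar}(f) \le \frac{c}{\delta}\,\mathcal{D}_\ell(f), \]

proving Claim 3.6. We now return to proving (12).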

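Let $g = \mu_{T_x \setminus (B_x \cup \partial B_x)}(f)$. Once we notice that $\mu_{\tilde T_x}(f) = \mu_{\tilde T_x}(g)$, we can use the spatial mixing assumption that precedes (12) to deduce

\[
\begin{aligned}
\mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)] &\le c \cdot \mu^\eta_{T_x}[\mathrm{Var}_{B_x}(g)] + \varepsilon \cdot \mu^\eta_{T_x}[\mathrm{Var}_{\tilde T_x}(g)] \\
&\le c \cdot \mu^\eta_{T_x}[\mathrm{Var}_{B_x}(f)] + \varepsilon \cdot \mu^\eta_{T_x}[\mathrm{Var}_{\tilde T_x}(g)],
\end{aligned}
\]

where we used the convexity of the variance for the second inequality. We will be done once we show that

\[ \mu^\eta_{T_x}[\mathrm{Var}_{\tilde T_x}(g)] \le \sum_{y \in B_x \cup \partial B_x,\; y \ne x} \mu^\eta_{T_x}[\mathrm{Var}_{T_y}(\mu_{\tilde T_y}(f))]. \tag{13} \]

But (13) follows from a similar argument to that used earlier to show $\mathrm{Var}(f) \le \mathrm{Pvar}(f)$, starting from the fact that $g = \mu_{F'_k}(f)$, where the forests $F'_i$ are defined analogously to the $F_i$ earlier but restricted to the subtree $T_x$, and $k = \mathrm{height}(x) - \ell$. We omit the details.

This concludes the proof of Claim 3.6 and thus of Theorems 3.2 and 3.4. □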



4 Verifying spatial mixing for the spectral gap 

In this section, we will prove that the spectral gap of the Glauber dynamics is bounded in all of the situations covered by Theorem 1.1 in the Introduction.

In light of Theorem 3.2, to bound the spectral gap it suffices to verify the Variance Mixing condition VM($\ell$, $\varepsilon$) with $\varepsilon = (1-\delta)/2(\ell+1-\delta)$, for some constants $\ell$, $\delta > 0$ independent of the size of $T$. In fact, we will show it with the asymptotically tighter value $\varepsilon = c\exp(-\vartheta\ell)$:

Theorem 4.1 In both of the following situations, there exists a positive constant $\vartheta$ (depending only on $b$, $\beta$ and $h$) such that, for all $T$, the Gibbs distribution $\mu = \mu_T^\tau$ satisfies VM($\ell$, $e^{-\vartheta\ell}$) for all $\ell$:

(i) $\tau$ is arbitrary, and either $\beta < \beta_1$ (with $h$ arbitrary), or $|h| > h_c(\beta)$ (with $\beta$ arbitrary);

(ii) $\tau$ is the (+)-boundary condition, and $\beta$, $h$ are arbitrary.

As a corollary, in both situations $c_{gap}(\mu) = \Omega(1)$.

Remark: The validity of VM, i.e., the decay of point-to-set correlations, is of interest independently of its implication for the spectral gap (an implication which is new to this paper): e.g., it is closely related to the purity of the infinite volume Gibbs measure and to bit reconstruction problems on trees [13]. In the special case of a free boundary and $h = 0$, part (i) of Theorem 4.1 was first proved in |0 via a lengthy calculation, which was considerably simplified in [19]. It was later reproved in [3] (for arbitrary boundary conditions) as a consequence of the fact that the spectral gap is bounded in this situation. An extension to general trees can be found in [13] and [20]. Our motivation for presenting another proof of part (i) (in addition to handling general fields $h$) is the simplicity of our argument compared with previous ones. As far as part (ii) is concerned, we are unaware of any previous results for the case of the (+)-boundary other than the fact that VM($\ell$, $\varepsilon(\ell)$) must hold with $\lim_{\ell\to\infty} \varepsilon(\ell) = 0$ because the (+)-phase is pure (see, e.g., [15]).

The rest of this section is divided into two parts. First, we develop a general framework based on coupling in order to establish the exponential decay of point-to-set correlations. This framework identifies two key quantities, $\kappa$ and $\gamma$, and states that when their product is small enough then VM holds. Then, in the second part, we go back to proving Theorem 4.1 by calculating $\kappa$ and $\gamma$ for each of the above two regimes separately.

4.1 A coupling argument for decay of point-to-set correlations 

In this section we develop a coupling framework that enables us to verify the exponential decay of point-to-set correlations from a simple calculation involving single-spin distributions.

First we need some additional notation. When $x$ is not the root of $T$, let $\mu^+_{T_x}$ (respectively, $\mu^-_{T_x}$) denote the Gibbs distribution in which the parent of $x$ has its spin fixed to (+) (respectively, (−)) and the configuration on the bottom boundary of $T_x$ is specified by $\tau$ (the global boundary condition on $T$). For two distributions $\mu_1$ and $\mu_2$, we denote by $\|\mu_1 - \mu_2\|_x$ the variation distance between the projections of $\mu_1$ and $\mu_2$ onto the spin at $x$. (Since the Ising model has only two spin values, $\|\mu_1 - \mu_2\|_x = |\mu_1(\sigma_x = +) - \mu_2(\sigma_x = +)|$.) Recall also that $\eta^y$ denotes the configuration $\eta$ with the spin at site $y$ flipped.

We now identify two constants that are crucial for our coupling argument: 

Definition 4.2 For a sequence of Gibbs distributions $\{\mu_T^\tau\}$ corresponding to a fixed boundary condition $\tau$, define $\kappa = \kappa(\{\mu_T^\tau\})$ and $\gamma = \gamma(\{\mu_T^\tau\})$ by

(i) $\kappa = \sup_T \max_z \|\mu^+_{T_z} - \mu^-_{T_z}\|_z$;

(ii) $\gamma = \sup_T \max \|\mu^\eta_A - \mu^{\eta^y}_A\|_z$, where the maximum is taken over all subsets $A \subseteq T$, all boundary configurations $\eta$, all sites $y$ on the boundary of $A$ and all neighbors $z \in A$ of $y$.

Note that $\kappa$ is the same as $\gamma$, except that the maximization is restricted to $A = T_z$ and the boundary vertex $y$ being the parent of $z$; hence always $\kappa \le \gamma$. Since $\kappa$ involves Gibbs distributions only on maximal subtrees $T_z$, it may depend on the boundary condition $\tau$ at the bottom of the tree. By contrast, $\gamma$ bounds the worst-case probability of disagreement for an arbitrary subset $A$ and arbitrary boundary configuration around $A$, and hence depends only on $(\beta, h)$ and not on $\tau$. It is the dependence of $\kappa$ on $\tau$ that opens up the possibility of an analysis that is specific to the boundary condition. For example, at very low temperature and with no external field, $\kappa$ is close to 1 in the free boundary case, while it is close to zero in the (+)-boundary case.
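The contrast in this last example can be checked numerically. The sketch below borrows, as assumptions, two ingredients that the paper only introduces in Section 4.2: the recursion R_z = e^{-2βh} ∏ F(R_y) for the ratio R_z of the probabilities of a (−)-spin and a (+)-spin at z, and the function K_β with κ = sup max K_β(R_z). The function names and the parameter values (b = 2, β = 3, h = 0) are illustrative choices only.

```python
import math

def F(a, beta):
    t = math.exp(-2.0 * beta)
    return (a + t) / (t * a + 1.0)

def K(a, beta):
    return 1.0 / (math.exp(-2.0 * beta) * a + 1.0) - 1.0 / (math.exp(2.0 * beta) * a + 1.0)

def kappa(beta, h, b, boundary, levels=200):
    """Largest K(R_z) over the first `levels` levels above the bottom of the tree."""
    field = math.exp(-2.0 * beta * h)
    # Ratio P(-)/P(+) at a site on the lowest level of T:
    # with (+)-boundary its b children are fixed to (+) (ratio 0);
    # with free boundary it has no children at all.
    r = field * F(0.0, beta) ** b if boundary == "plus" else field
    worst = 0.0
    for _ in range(levels):
        worst = max(worst, K(r, beta))
        r = field * F(r, beta) ** b   # move one level up the tree
    return worst

kappa_free = kappa(3.0, 0.0, 2, "free")
kappa_plus = kappa(3.0, 0.0, 2, "plus")
print(kappa_free, kappa_plus)
```

With these parameters the free-boundary value equals tanh(3), close to 1, while the (+)-boundary value is well below 1/b.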

In our arguments $\kappa$ will be used to bound the probability of a disagreement percolating one level down the tree, namely, when we fix a disagreement at $x$ and couple the two resulting marginals on a child $z$ of $x$. On the other hand, $\gamma$ will be used in order to bound the probability of a disagreement percolating one level up the tree, namely, when we fix a single disagreement on the bottom boundary of a block, say at $y$ (with the rest of the boundary configuration being arbitrary), and couple the marginals on the parent of $y$.

The novelty of our argument for establishing VM comes from the fact that we identify two separate constants $\kappa$ and $\gamma$, and consider their product, rather than working with $\kappa$ alone:

Theorem 4.3 Any Gibbs distribution $\mu = \mu_T^\tau$ satisfies VM($\ell$, $(\gamma\kappa b)^\ell$) for all $\ell$, where $\kappa$ and $\gamma$ are the constants associated with the sequence $\{\mu_T^\tau\}$ as specified in Definition 4.2. In particular, if $\gamma\kappa b < 1$ then there exists a constant $\vartheta > 0$ such that, for every $T$, the measure $\mu = \mu_T^\tau$ satisfies VM($\ell$, $e^{-\vartheta\ell}$) for all $\ell$, and hence $c_{gap}(\mu) = \Omega(1)$.



^Notice that we do not specify the rest of the configuration outside $T_x$ since it has no influence on the distribution inside $T_x$ once the spin at the parent of $x$ is fixed. However, since our distributions are defined over the whole configuration space, in the discussion below when the configuration outside $T_x$ is relevant it will be understood from the context.

Proof: Fix arbitrary $T$, $x \in T$, $\eta \in \Omega_T^\tau$. We need to show that for every function $f$ that does not depend on $B_{x,\ell}$, $\mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)] \le \varepsilon \cdot \mathrm{Var}^\eta_{T_x}(f)$ with $\varepsilon = (\gamma\kappa b)^\ell$, i.e., projecting $f$ onto the root (of $T_x$) causes the variance to shrink by a factor $\varepsilon$. As is well known, it is enough to establish a dual contraction, i.e., to consider an arbitrary function that depends only on the spin at the root and show that, when projecting onto levels $\ell$ and below, the variance shrinks by a factor $\varepsilon$. Formally, it is enough to show that for every function $g$ that does not depend on $\tilde T_x$ we have

\[ \mathrm{Var}^\eta_{T_x}[\mu_{B_{x,\ell}}(g)] \le \varepsilon \cdot \mathrm{Var}^\eta_{T_x}(g). \tag{14} \]

This is because for a function $f$ that does not depend on $B_{x,\ell}$, the variance of the projection can be written as

\[ \mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)] = \mathrm{Cov}^\eta_{T_x}(f, \mu_{\tilde T_x}(f)) = \mathrm{Cov}^\eta_{T_x}\big(f, \mu_{B_{x,\ell}}(\mu_{\tilde T_x}(f))\big) \le \sqrt{\mathrm{Var}^\eta_{T_x}(f) \cdot \mathrm{Var}^\eta_{T_x}[\mu_{B_{x,\ell}}(\mu_{\tilde T_x}(f))]}, \]

where $\mathrm{Cov}^\eta_{T_x}(f, f')$ denotes the covariance $\mu^\eta_{T_x}(f f') - \mu^\eta_{T_x}(f)\mu^\eta_{T_x}(f')$ and the last inequality is an application of Cauchy-Schwarz. We then have

\[ \mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)] \le \mathrm{Var}^\eta_{T_x}(f) \cdot \frac{\mathrm{Var}^\eta_{T_x}[\mu_{B_{x,\ell}}(\mu_{\tilde T_x}(f))]}{\mathrm{Var}^\eta_{T_x}[\mu_{\tilde T_x}(f)]}. \]

If we assume (14) then the expression on the r.h.s. is bounded by $\varepsilon \cdot \mathrm{Var}^\eta_{T_x}(f)$ since $g = \mu_{\tilde T_x}(f)$ does not depend on $\tilde T_x$.

We therefore proceed with the proof of (14), which goes via a coupling argument. A coupling of two distributions $\mu_1, \mu_2$ on $\Omega$ is any joint distribution $\nu$ on $\Omega^2$ whose marginals are $\mu_1$ and $\mu_2$ respectively. For two configurations $\sigma, \sigma' \in \Omega$, let $|\sigma - \sigma'|_{x,\ell}$ denote the Hamming distance between the restrictions of $\sigma$ and $\sigma'$ to $\partial B_{x,\ell}$, i.e., the number of sites at distance $\ell$ below $x$ at which $\sigma$ and $\sigma'$ differ. Notice that $|\sigma - \sigma'|_{x,\ell}$ can be at most $b^\ell$, the number of sites on the $\ell$th level below $x$. Let $\mu^+_{\tilde T_x}$ (respectively, $\mu^-_{\tilde T_x}$) stand for the Gibbs distribution where the spin at $x$ is set to (+) (respectively, (−)) and, as usual, the configuration on the bottom boundary of $T_x$ is specified by $\tau$. Our goal will be to construct a coupling $\nu$ of $\mu^+_{\tilde T_x}$ and $\mu^-_{\tilde T_x}$ for which the expectation $E_\nu|\sigma - \sigma'|_{x,\ell} = \sum_{\sigma,\sigma'} \nu(\sigma,\sigma')\,|\sigma - \sigma'|_{x,\ell}$ is small.

Claim 4.4 For every $x \in T$ and all $\ell$ the following hold:

(i) There is a coupling $\nu$ of $\mu^+_{\tilde T_x}$ and $\mu^-_{\tilde T_x}$ for which $E_\nu|\sigma - \sigma'|_{x,\ell} \le (\kappa b)^\ell$.

(ii) For any $\eta, \eta' \in \Omega$ that have the same spin value at the parent of $x$, $\|\mu^\eta_{B_{x,\ell}} - \mu^{\eta'}_{B_{x,\ell}}\|_x \le \gamma^\ell \cdot |\eta - \eta'|_{x,\ell}$.

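Let us assume Claim 4.4 for the moment and complete the proof of (14). Consider an arbitrary $g$ that does not depend on $\tilde T_x$. Let $p = \mu^\eta_{T_x}(\sigma_x = +)$ and $q = 1 - p = \mu^\eta_{T_x}(\sigma_x = -)$. We also write $g^+$ for $g(\sigma)$, where $\sigma$ is any configuration that agrees with $\eta$ outside $T_x$ and such that $\sigma_x = +$. (This is well defined since $g$ does not depend on $\tilde T_x$.) We define $g^-$ similarly. Without loss of generality we may assume that in the coupling $\nu$ from Claim 4.4 both the coupled configurations agree with $\eta$ outside $T_x$ with probability 1. We then have

\[
\begin{aligned}
\mathrm{Var}^\eta_{T_x}[\mu_{B_{x,\ell}}(g)] &= \mathrm{Cov}^\eta_{T_x}\big(g, \mu_{B_{x,\ell}}(g)\big) \\
&= \mathrm{Cov}^\eta_{T_x}\big(g, \mu_{\tilde T_x}(\mu_{B_{x,\ell}}(g))\big) \\
&= pq\,(g^+ - g^-)\,\big[\mu^+_{\tilde T_x}(\mu_{B_{x,\ell}}(g)) - \mu^-_{\tilde T_x}(\mu_{B_{x,\ell}}(g))\big] \\
&= pq\,(g^+ - g^-) \sum_{\sigma,\sigma'} \nu(\sigma,\sigma')\,\big[\mu^{\sigma}_{B_{x,\ell}}(g) - \mu^{\sigma'}_{B_{x,\ell}}(g)\big] \\
&\le pq\,|g^+ - g^-| \sum_{\sigma,\sigma'} \nu(\sigma,\sigma')\,\|\mu^{\sigma}_{B_{x,\ell}} - \mu^{\sigma'}_{B_{x,\ell}}\|_x\,|g^+ - g^-| \\
&\le pq\,(g^+ - g^-)^2 \sum_{\sigma,\sigma'} \nu(\sigma,\sigma')\,\gamma^\ell\,|\sigma - \sigma'|_{x,\ell} \\
&= \gamma^\ell \cdot \mathrm{Var}^\eta_{T_x}(g) \cdot E_\nu|\sigma - \sigma'|_{x,\ell} \\
&\le (\gamma\kappa b)^\ell \cdot \mathrm{Var}^\eta_{T_x}(g).
\end{aligned}
\tag{15}
\]

^Effectively this means that, conditioned on the configuration outside $T_x$ being $\eta$, $g$ depends only on the spin at the root $x$.

In the sixth line here we have used part (ii) of Claim 4.4 and in the last line we have used part (i). This completes the proof of (14), and hence of Theorem 4.3. We thus go back and prove Claim 4.4.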
The proof of Claim 4.4 makes use of a standard recursive coupling along paths in the tree (as in, e.g., Q). We start with part (i), i.e., constructing a coupling $\nu$ of $\mu^+_{\tilde T_x}$ and $\mu^-_{\tilde T_x}$ with the required properties. Since the underlying graph is a tree, we can couple $\mu^+_{\tilde T_x}$ and $\mu^-_{\tilde T_x}$ recursively. This goes as follows. First, given the spin at $x$ the measures on $T_z$ (where $z$ ranges over the children of $x$) are all independent of each other, so we can couple the projections on the $T_z$'s independently. Then, we couple the two projections on $T_z$ by first coupling the spin at $z$ using the optimal coupling (the one that achieves the variation distance) of the marginal measures on the spin at $z$. Thus, the spins at $z$ disagree with probability at most $\kappa$. Once a coupled pair of spins at $z$ is chosen, we continue as follows: if the spins at $z$ agree then we can make the configurations in $T_z$ equal with probability 1 (because the two boundary conditions are the same); if the spins at $z$ differ (i.e., one is (+) and the other (−)) then we recursively couple $\mu^+_{\tilde T_z}$ and $\mu^-_{\tilde T_z}$. We let $\nu$ be the resulting coupling of $\mu^+_{\tilde T_x}$ and $\mu^-_{\tilde T_x}$ and notice that $E_\nu|\sigma - \sigma'|_{x,\ell} \le (\kappa b)^\ell$ since for every site $y$ at distance $\ell$ below $x$ the probability that the two coupled spins at $y$ disagree is at most $\kappa^\ell$.
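The bound on the expected Hamming distance can be illustrated by simulating the spread of disagreements directly. The sketch below assumes the worst case in which every child of a disagreeing site disagrees independently with probability exactly κ (and children of agreeing sites always agree), so the number of disagreements per level is a Galton-Watson branching process with Binomial(b, κ) offspring. Function names, the seed and the parameter values are illustrative assumptions.

```python
import random

def disagreements_at_level(b, kappa, ell, rng):
    """Number of disagreeing sites ell levels below a disagreeing root."""
    count = 1                      # the disagreement fixed at x
    for _ in range(ell):
        # each of the count*b children disagrees independently w.p. kappa
        count = sum(1 for _ in range(count * b) if rng.random() < kappa)
    return count

def mean_disagreements(b, kappa, ell, trials, seed=0):
    rng = random.Random(seed)
    total = sum(disagreements_at_level(b, kappa, ell, rng) for _ in range(trials))
    return total / trials

b, kappa, ell = 2, 0.4, 4
estimate = mean_disagreements(b, kappa, ell, trials=20000)
bound = (kappa * b) ** ell
print(estimate, bound)
```

In this worst case the expectation equals (κb)^ℓ exactly, so the Monte Carlo estimate should land close to the bound; with smaller per-site disagreement probabilities it would fall below it.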

We go on to prove part (ii) of Claim 4.4. First, by writing a telescopic sum and applying the triangle inequality we get that

\[ \|\mu^{\eta}_{B_{x,\ell}} - \mu^{\eta'}_{B_{x,\ell}}\|_x \le \sum_{i=1}^{k} \|\mu^{\eta^{(i-1)}}_{B_{x,\ell}} - \mu^{\eta^{(i)}}_{B_{x,\ell}}\|_x, \]

where $k = |\eta - \eta'|_{x,\ell}$ and the sequence of configurations $\eta^{(i)}$ is a site-by-site interpolation of the differences between $\eta$ and $\eta'$ in $\partial B_{x,\ell}$. (It suffices to interpolate only over the differences in $\partial B_{x,\ell}$ since the measure $\mu_{B_{x,\ell}}$ depends only on the configuration in $\partial B_{x,\ell}$ and since $\eta$ and $\eta'$ agree on the parent of $x$.) It is now enough to show that $\|\mu^{\eta}_{B_{x,\ell}} - \mu^{\eta^w}_{B_{x,\ell}}\|_x \le \gamma^\ell$ for all $\eta$ and $w \in \partial B_{x,\ell}$. This, however, follows by a coupling argument as before, where this time we couple recursively along the path from $w$ to $x$ (i.e., up the tree). Specifically, suppose by induction that in our coupling there is already a path of disagreement going from $w$ to $y$, where $y$ is some site on the path from $w$ to $x$. Let $z$ denote the parent of $y$. At the next step we choose a coupled pair of spins at $z$ from the two distributions $\mu^\eta_A$ and $\mu^{\eta^y}_A$ (using an optimal coupling for the projections onto the spin at $z$), where the subset $A$ is $B_{x,\ell}$ excluding the path from $w$ to $y$. The probability of disagreement at $z$ given the disagreement at $y$ is then bounded by $\gamma$, by definition. If the resulting spins at $z$ agree then the spins on the rest of the path are coupled to agree with certainty, while if there is a disagreement at $z$ we continue recursively starting from the disagreement at $z$. We therefore conclude that the probability of disagreement at $x$ in the resulting coupling is as required. □

Remark: We emphasize that Theorem 4.3 is not specific to the Ising model and generalizes to arbitrary nearest-neighbor models on a tree. Although we used the fact that the Ising model has only two possible spin values, the proof can easily be generalized to more than two spin values at the cost of a factor $\frac{1}{p_{\min}}$ in front of $(\gamma\kappa b)^\ell$ in VM, where $p_{\min}$ is the minimum probability of any spin value as defined just before Theorem 3.4. Thus, since Theorem 3.2 also applies to general nearest-neighbor spin systems on a tree, we conclude that the implication from $\gamma\kappa b < 1$ to a bounded $c_{gap}(\mu)$ holds for any such system (with the definitions of $\kappa$ and $\gamma$ extended in the obvious way to systems with more than two spin values). The details can be found in the companion paper [31].



4.2 Proof of Theorem 4.1

In this section we go back to proving Theorem 4.1. Using Theorem 4.3, all we need to do for the given choices of the Ising model parameters is to bound $\kappa$ and $\gamma$ as in Definition 4.2 such that $\gamma\kappa b < 1$. In contrast to Sections 3 and 4.1, which apply to general nearest-neighbor spin systems on trees, here the calculations are specific to the Ising model.

For both $\kappa$ and $\gamma$, we need to bound a quantity of the form $\|\mu^\eta_A - \mu^{\eta^y}_A\|_z$, where $y \in \partial A$ and $z \in A$ is a neighbor of $y$. The key observation is that this quantity can be expressed very cleanly in terms of the "magnetization" at $z$, i.e., the ratio of probabilities of a (−)-spin and a (+)-spin at $z$. It will actually be convenient to work with the magnetization without the influence of the neighbor $y$: thus we let $\mu^{\eta,y=*}_A$ denote the Gibbs distribution with boundary condition $\eta$, except that the spin at $y$ is free (or equivalently, the edge connecting $z$ to $y$ is erased). We then have:

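Proposition 4.5 For any subset $A \subseteq T$, any boundary configuration $\eta$, any site $y \in \partial A$ and any neighbor $z \in A$ of $y$, we have

\[ \|\mu^\eta_A - \mu^{\eta^y}_A\|_z = K_\beta(R), \]

where $R = \dfrac{\mu^{\eta,y=*}_A(\sigma_z = -)}{\mu^{\eta,y=*}_A(\sigma_z = +)}$ and the function $K_\beta$ is defined by

\[ K_\beta(a) = \frac{1}{e^{-2\beta}a + 1} - \frac{1}{e^{2\beta}a + 1}. \]

Proof: First, w.l.o.g. we may assume that the edge between $y$ and $z$ is the only one connecting $y$ to $A$; this is because a tree has no cycles, so once the spin at $y$ is fixed $A$ decomposes into disjoint components that are independent. We also assume w.l.o.g. that the spin at $y$ is (+) in $\eta$, and we abbreviate $\mu^\eta_A$ and $\mu^{\eta^y}_A$ to $\mu^+_A$ and $\mu^-_A$ respectively, and also $\mu^{\eta,y=*}_A$ to $\mu^*_A$. Thus $\|\mu^\eta_A - \mu^{\eta^y}_A\|_z = |\mu^+_A(\sigma_z = +) - \mu^-_A(\sigma_z = +)|$, and $R = \dfrac{\mu^*_A(\sigma_z = -)}{\mu^*_A(\sigma_z = +)}$. We write $R^+$ for $\dfrac{\mu^+_A(\sigma_z = -)}{\mu^+_A(\sigma_z = +)}$ and $R^-$ for $\dfrac{\mu^-_A(\sigma_z = -)}{\mu^-_A(\sigma_z = +)}$. Since the only influence of $y$ on $A$ is through $z$, we have $R^+ = e^{-2\beta}R$ and $R^- = e^{2\beta}R$. The proposition now follows once we notice that, by definition of $R^+$ and $R^-$, $\mu^+_A(\sigma_z = +) = \frac{1}{R^+ + 1}$ and $\mu^-_A(\sigma_z = +) = \frac{1}{R^- + 1}$. □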

Now it is easy to check that $K_\beta(a)$ is an increasing function in the interval $[0, 1]$, decreasing in the interval $[1, \infty]$, and is maximized at $a = 1$. Therefore, we can always bound $\kappa$ and $\gamma$ from above by $K_\beta(1) = \frac{1 - e^{-2\beta}}{1 + e^{-2\beta}}$. Indeed, for $\gamma$ we must make do with this crude bound because it has to hold for any boundary configuration $\eta$ and we cannot hope to gain by controlling the magnetization $R$. However, as we shall see, for $\kappa$ we can do better in some cases by computing the magnetization at the root; when this differs from 1 we get a better bound than $K_\beta(1)$.
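The shape of K_β is easy to confirm numerically. The sketch below evaluates K_β on a grid and checks that the maximum sits at a = 1, where the value (1 − e^{−2β})/(1 + e^{−2β}) coincides with tanh(β). The grid and the value β = 0.8 are arbitrary illustrative choices.

```python
import math

def K(beta, a):
    """K_beta(a) = 1/(e^{-2 beta} a + 1) - 1/(e^{2 beta} a + 1)."""
    return 1.0 / (math.exp(-2.0 * beta) * a + 1.0) - 1.0 / (math.exp(2.0 * beta) * a + 1.0)

beta = 0.8
grid = [i / 100 for i in range(0, 501)]          # a in [0, 5], step 0.01
values = [K(beta, a) for a in grid]
a_star = grid[values.index(max(values))]          # grid point where K is largest
crude_bound = (1.0 - math.exp(-2.0 * beta)) / (1.0 + math.exp(-2.0 * beta))
print(a_star, K(beta, 1.0), crude_bound)
```

Note also that K_β(0) = 0, which is why a magnetization ratio far from 1 gives a much better bound on κ than the crude one.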

We are now ready to proceed to the proof of Theorem 4.1.

(i) Arbitrary boundary conditions

Here, the boundary condition $\tau$ is arbitrary and we first consider the (easy) case when $\beta < \beta_0$ or $|h| > h_c(\beta)$ (i.e., $h$ is super-critical). In this case we do not need to resort to the calculation of $\kappa$ and $\gamma$. As discussed in the Introduction, in this regime there is a unique infinite volume Gibbs measure, so certainly the variation distance at the root $\max_{\eta,\eta'} \|\mu^\eta_{B_{x,\ell}} - \mu^{\eta'}_{B_{x,\ell}}\|_x$ goes to zero as $\ell$ increases. In fact, it is not too difficult to see that in the above regime this variation distance goes to zero exponentially fast, which directly implies the desired exponential decay of correlations (VM) by plugging the bound on the variation distance into expression (15) in the proof of Theorem 4.3.

We go on to consider the more interesting regime when $\beta_0 \le \beta < \beta_1$ (i.e., intermediate temperatures) and the external field $h$ is arbitrary. Here we use the fact that $\kappa \le \gamma \le K_\beta(1)$. We then certainly have $\gamma\kappa b < 1$ whenever $K_\beta(1) = \frac{1 - e^{-2\beta}}{1 + e^{-2\beta}} < \frac{1}{\sqrt{b}}$, i.e., whenever $e^{-2\beta} > \frac{\sqrt{b} - 1}{\sqrt{b} + 1}$. From the definition of $\beta_1$ (see Section 1.1), this corresponds precisely to $\beta < \beta_1$. (Observe how this non-trivial result drops out immediately from our machinery, as expressed in the condition $\gamma^2 < \frac{1}{b}$.) This completes the verification of Theorem 4.1 part (i).
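The threshold in this argument can be made explicit. Solving the inequality e^{−2β} > (√b − 1)/(√b + 1) for β gives β < ½·ln((√b + 1)/(√b − 1)); the sketch below assumes that this closed form agrees with the definition of β_1 in Section 1.1, which is not reproduced here. The function names are hypothetical.

```python
import math

def beta_1(b):
    """Threshold obtained by solving e^{-2 beta} = (sqrt(b)-1)/(sqrt(b)+1); needs b >= 2."""
    r = math.sqrt(b)
    return 0.5 * math.log((r + 1.0) / (r - 1.0))

def crude_product(beta, b):
    """Upper bound b * K_beta(1)^2 on gamma*kappa*b, using kappa <= gamma <= K_beta(1) = tanh(beta)."""
    return b * math.tanh(beta) ** 2

for b in (2, 3, 5):
    bt = beta_1(b)
    # just below the threshold the crude bound is < 1, just above it is > 1
    print(b, bt, crude_product(bt - 1e-3, b), crude_product(bt + 1e-3, b))
```

Above the threshold the crude bound stops being useful, which is where the finer control of κ in part (ii) comes in.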

(ii) (+)-boundary condition

We now assume that $\tau$ is the all-(+) configuration and consider arbitrary $\beta$ and $h$. For convenience, we assume $h \ge -h_c(\beta)$ since the case $|h| > h_c(\beta)$ was covered in part (i) for all boundary conditions $\tau$. The important property of the regime $h \ge -h_c(\beta)$ is that, for the (+)-boundary, the spin at the root is at least as likely to be (+) as it is to be (−). We will show that $\gamma\kappa b < 1$ throughout this regime. Recall that we already showed that $\gamma \le K_\beta(1) < 1$ for all finite $\beta$. It is therefore enough to show that $\kappa \le \frac{1}{b}$.

To calculate $\kappa$, we need to bound the variation distance $\|\mu^+_{T_z} - \mu^-_{T_z}\|_z$, which by Proposition 4.5 is equal to $K_\beta(R_z)$, where $R_z = \dfrac{\mu^*_{T_z}(\sigma_z = -)}{\mu^*_{T_z}(\sigma_z = +)}$ and $\mu^*_{T_z}$ is the Gibbs distribution over the subtree $T_z$ when it is disconnected from the rest of $T$ and the spins on its bottom boundary agree with $\tau$. We thus have $\kappa = \sup_T \max_{z \in T} K_\beta(R_z)$.

The final ingredient we need is a recursive computation of the magnetization $R_z$, the details of which (up to change of variables) can be found in (2) or 0. Let $y \prec z$ denote that $y$ is a child of $z$. A simple direct calculation gives that $R_z = e^{-2\beta h} \prod_{y \prec z} F(R_y)$, where $F(a) = F_\beta(a) = \dfrac{a + e^{-2\beta}}{e^{-2\beta}a + 1}$. In particular, if $z$ is any site on the bottom-most level of $T$, then since the spins of the children of $z$ are all set deterministically to (+), we get that $R_z = e^{-2\beta h}[F(0)]^b$. We thus define

\[ J(a) = J_{\beta,h}(a) = e^{-2\beta h}\,[F(a)]^b \tag{16} \]

and observe that, for any $z \in T$, $R_z = J^{(\ell)}(0)$, where $J^{(\ell)}$ stands for the $\ell$-fold composition of $J$, and $\ell$ is the distance of $z$ from the bottom boundary of $T$.

We now describe some properties of $J$ that we use (refer to Fig. 2): $J$ is continuous and increasing on $[0, \infty)$, with $J(0) = e^{-2\beta(h+b)} > 0$ and $\sup_a J(a) = e^{-2\beta(h-b)} < \infty$. This immediately implies that $J$ has at least one fixed point in $[0, \infty)$; we denote by $a_0$ the least fixed point. Since $a_0$ is the least fixed point and $J(0) > 0$ then clearly $J'(a_0) \le 1$, where $J'(a) = \frac{\partial J(a)}{\partial a}$ is the derivative of $J$. We also note that $a_0 \le 1$ when $h \ge -h_c(\beta)$, which corresponds to the fact that for the (+)-boundary and the above regime of $h$, the spin at the root is at least as likely to be (+) as (−).

Now, since $J$ is monotonically increasing and $a_0$ is the least fixed point of $J$, clearly $J^{(\ell)}(0)$ converges to $a_0$ from below, i.e., $R_z \le a_0$ for every $z \in T$. Thus, since $a_0 \le 1$ for $h \ge -h_c(\beta)$, and the function $K_\beta(a)$ is monotonically increasing in the interval $[0, 1]$, $K_\beta(R_z) \le K_\beta(a_0)$ for every $z \in T$.
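The monotone convergence of the magnetization ratios can be observed directly by iterating J from 0. The sketch below does this for one illustrative low-temperature parameter choice (b = 2, β = 1, h = 0, which are assumptions made for the example, not values from the paper) and checks that K_β along the iterates never exceeds K_β(a_0), which in turn is below 1/b.

```python
import math

def F(beta, a):
    t = math.exp(-2.0 * beta)
    return (a + t) / (t * a + 1.0)

def J(beta, h, b, a):
    return math.exp(-2.0 * beta * h) * F(beta, a) ** b

def K(beta, a):
    return 1.0 / (math.exp(-2.0 * beta) * a + 1.0) - 1.0 / (math.exp(2.0 * beta) * a + 1.0)

def iterates(beta, h, b, steps):
    """R_z for sites at distance 1, 2, ..., steps from the (+)-boundary."""
    a, out = 0.0, []
    for _ in range(steps):
        a = J(beta, h, b, a)
        out.append(a)
    return out

beta, h, b = 1.0, 0.0, 2
R = iterates(beta, h, b, 200)
a0 = R[-1]                                   # numerical least fixed point of J
kappa_bound = max(K(beta, r) for r in R)     # sup of K(R_z) over the computed levels
print(a0, kappa_bound, 1.0 / b)
```

Starting the iteration at 0 matters: it is what singles out the least fixed point, in line with the role of the (+)-boundary in the argument.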

What remains to be shown is that $K_\beta(a_0) \le \frac{1}{b}$. This follows from the fact that $J'(a_0) \le 1$, together with the following lemma:

Lemma 4.6 Let $a_0$ be any fixed point of $J$. Then $K_\beta(a_0) = \frac{1}{b} \cdot J'(a_0)$.

Figure 2: Curve of the function $J(a)$, used in the proof of Theorem 4.1, for $\beta > \beta_0$ and various values of the external field $h$. (i) $h = -h_c(\beta)$; (ii) $h_c(\beta) > h > -h_c(\beta)$; (iii) $h > h_c(\beta)$. The point $a_0$ is the smallest fixed point of $J$.



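Proof: From the definitions of $J$ and $F$ we have:

\[
\begin{aligned}
J'(a_0) &= e^{-2\beta h} \cdot b \cdot F(a_0)^{b-1} F'(a_0) \\
&= b \cdot J(a_0) \cdot \frac{F'(a_0)}{F(a_0)} \\
&= b \cdot a_0 \cdot \frac{F'(a_0)}{F(a_0)} \\
&= b \cdot a_0 \cdot \frac{1 - e^{-4\beta}}{(a_0 + e^{-2\beta})(e^{-2\beta}a_0 + 1)} \\
&= b \cdot K_\beta(a_0). \qquad \square
\end{aligned}
\]

This completes the verification of Theorem 4.1 part (ii).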
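Lemma 4.6 can also be confirmed numerically. The sketch below locates the least fixed point of J by iteration from 0 and compares K_β(a_0) with J'(a_0)/b, using the closed form F'(a) = (1 − e^{−4β})/(e^{−2β}a + 1)² obtained by differentiating F. The three parameter triples are arbitrary illustrative choices with h ≥ 0, and the function names are hypothetical.

```python
import math

def F(a, beta):
    t = math.exp(-2.0 * beta)
    return (a + t) / (t * a + 1.0)

def dF(a, beta):
    t = math.exp(-2.0 * beta)
    return (1.0 - t * t) / (t * a + 1.0) ** 2

def J(a, beta, h, b):
    return math.exp(-2.0 * beta * h) * F(a, beta) ** b

def dJ(a, beta, h, b):
    # chain rule applied to J(a) = e^{-2 beta h} F(a)^b
    return math.exp(-2.0 * beta * h) * b * F(a, beta) ** (b - 1) * dF(a, beta)

def K(a, beta):
    return 1.0 / (math.exp(-2.0 * beta) * a + 1.0) - 1.0 / (math.exp(2.0 * beta) * a + 1.0)

def least_fixed_point(beta, h, b, steps=500):
    a = 0.0
    for _ in range(steps):
        a = J(a, beta, h, b)
    return a

for beta, h, b in [(1.0, 0.0, 2), (0.5, 0.3, 3), (1.5, 0.2, 4)]:
    a0 = least_fixed_point(beta, h, b)
    print(beta, h, b, a0, K(a0, beta), dJ(a0, beta, h, b) / b)
```

The last two printed columns should coincide up to rounding, and both stay at or below 1/b.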



5 Verifying spatial mixing for log-Sobolev 

In this section we will prove a uniform lower bound (independent of $n$) on the logarithmic Sobolev constant $c_{sob}(\mu)$ in all the situations covered by Theorem 1.2 in the Introduction.

In light of Theorem 3.4, to show $c_{sob} = \Omega(1)$ we need only prove the validity of the Entropy Mixing condition EM($\ell$, $[(1-\delta)p_{\min}/2(\ell+1-\delta)]^2$) for some constants $\ell$ and $\delta$ independent of the size of $T$. In order to establish EM in the situations covered by Theorem 1.2 we extend the coupling framework developed in Section 4.1 so that it can be used to establish EM. As before, we will use a condition on the constants $\kappa$ and $\gamma$, which were defined in Section 4.1. In fact, the condition on $\kappa$ and $\gamma$ for establishing EM is practically the same as the one that was used to establish VM, which immediately transfers our $\Omega(1)$ bound on $c_{gap}$ for the relevant parameters to an $\Omega(1)$ bound on $c_{sob}$ for the same choice of parameters. The main result of this section is the following relationship between $(\kappa, \gamma)$ and EM.

Theorem 5.1 Any Gibbs distribution $\mu = \mu_T^\tau$ satisfies EM($\ell$, $c(\gamma\alpha)^{\ell/5}$) for all $\ell$, where $\alpha = \max\{\kappa b, 1\}$, $\kappa$ and $\gamma$ are the constants associated with the sequence $\{\mu_T^\tau\}$ as specified in Definition 4.2, and $c$ is a constant that depends only on $(b, \beta, h)$. In particular, if $\max\{\gamma\kappa b, \gamma\} < 1$ then there exists a constant $\vartheta > 0$ such that, for every $T$, the measure $\mu = \mu_T^\tau$ satisfies EM($\ell$, $c e^{-\vartheta\ell}$) for all $\ell$, and hence $c_{sob}(\mu) = \Omega(1)$.

Remark: We should note that the above theorem, like its counterpart for the spectral gap, holds for any spin system on a tree (with the definitions of $\kappa$ and $\gamma$ generalized appropriately). See the companion paper [31] for details.

Since in Section 4.2 we have already calculated $\kappa$ and $\gamma$ for the regimes of interest and shown that in both cases $\max\{\gamma\kappa b, \gamma\} < 1$, we have:

Corollary 5.2 In both of the following situations, $c_{sob}(\mu) = \Omega(1)$:

(i) $\tau$ is arbitrary, and either $\beta < \beta_1$ (with $h$ arbitrary), or $|h| > h_c(\beta)$ (with $\beta$ arbitrary);

(ii) $\tau$ is the (+)-boundary condition and $\beta$, $h$ are arbitrary.

This completes the proof of our second main result, Theorem 1.2, stated in the Introduction.

The first step in proving Theorem 5.1 is a reduction of EM to a certain strong concentration property of $\mu$, the Gibbs measure under consideration. We believe that this concentration property, as well as its connection to EM, may be of independent interest. The statement of this property and the reduction of EM to it is the content of Section 5.1. Then, in Section 5.2, we complete the proof of Theorem 5.1 by relating the strong concentration property to $\kappa$ and $\gamma$.

It is worth mentioning that we are also able to establish a general (but cruder) bound on $c_{sob}$ as a function of $c_{gap}$. Specifically, we can show that $c_{sob} = \Omega(1/\log n) \times c_{gap}$. Although we do not need this bound in this paper, we present it in Section 5.3 for future reference since its proof is simple and short.



5.1 Establishing EM via a strong concentration property. 

In this subsection we reduce EM to a certain strong concentration property of $\mu$. In the next subsection, we will then establish this strong concentration property as a function of $\kappa$ and $\gamma$ in order to prove Theorem 5.1. For simplicity and without loss of generality, we will analyze the entropy mixing condition only for $T_x = T$ (the whole tree), with root $r$.

Let $\mu^+$ and $\mu^-$ denote the Gibbs distributions on $T$ with the spin at the root $r$ set to (+) and (−) respectively (the boundary condition on the leaves of $T$ being specified by $\tau$). Define

\[ g_+(\sigma) = \frac{\mu^+(\sigma)}{\mu(\sigma)} = \begin{cases} 1/p & \text{if } \sigma_r = (+), \\ 0 & \text{otherwise,} \end{cases} \]

where $p = \mu(\sigma_r = +)$. The key quantity we will work with in the sequel is the following:

\[ g_+^{(\ell)} = \mu_{B_{r,\ell}}(g_+). \]

Note that $g_+^{(\ell)}(\sigma)$ depends only on the spins in $\partial B_{r,\ell}$. Indeed, let $\sigma_{r,\ell}$ stand for the restriction of $\sigma$ to $\partial B_{r,\ell}$, i.e., to the sites at distance $\ell$ below $r$. It is easy to verify that $g_+^{(\ell)}(\sigma)$ is equal to $\dfrac{\mu^+(\sigma_{r,\ell})}{\mu(\sigma_{r,\ell})}$. Thus, for a given configuration $\sigma$, $g_+^{(\ell)}(\sigma)$ is the ratio of the probabilities of seeing the spins of $\sigma$ at level $\ell$ below the root $r$ when the spin at $r$ is (+) and when there is no condition on the spin at $r$, respectively. We define $g_-$ and $g_-^{(\ell)}$ in an analogous way.

The role played by the functions $g_+^{(\ell)}$ and $g_-^{(\ell)}$ is embodied in the following theorem, which says that if these functions are sufficiently tightly concentrated around their common mean value of 1 then the entropy mixing condition EM holds.

Theorem 5.3 There exists a constant c (depending only on b, (3 and h) such that, for any 5 > 0, if 

MD^-IIM < e " 2/5 C17) 

for s G {+, — }, then we have Ent[^(/)] < c5 Ent(f) for any non-negative function f that does not 
depend on B r ^; in particular, EM(£, cS) holds. 
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Proof: Fix £ < m and a non-negative function / that does not depend on the spins inside the 
block B rj £. Since Ent(/') < Var(/')//i(/ / ) for every non-negative function /' (see, e.g., IT36H ) then 

^f(f)} < Va r [ ^ ( /i )] = -T7T- t[4(/)-M/)] 2 + (l-p)[^(/)-M(/)] 2 " 



1 



T 

2 



P Cov( 5+ ,/) 2 + (l-p) Cov( 5 _,/) 2 



< max K ,' JJ , (18) 



where Cov denotes covariance w.r.t \i. Now observe that, since / does not depend on B r ^, when 

computing the covariance term in (1181) the function g s can be replaced by gy, which depends only 
on the spins in dB r £. Thus, if we can show that (II 7D implies 

Cov(#,/) 2 <c^(/)Ent(/) (19) 

for some constant c, then by plugging ( I19D into J18D we will get that Ent[/^(/)] < c<5Ent(/), as 
required. 

To establish (19) we make use of the following technical lemma, whose proof can be found in
Section 7.

Lemma 5.4 Let $\{\Omega, \mathcal{F}, \nu\}$ be a probability space and let $f_1$ be a mean-zero random variable such that
$\|f_1\|_\infty \le 1$ and $\nu[\,|f_1| > \delta\,] \le e^{-2/\delta}$ for some $\delta \in (0,1)$. Let $f_2$ be a probability density w.r.t. $\nu$,
i.e., $f_2 \ge 0$ and $\nu(f_2) = 1$. Then there exists a numerical constant $c' > 0$ independent of $\nu$, $f_1$, $f_2$ and
$\delta$, such that $\nu(f_1 f_2)^2 \le c'\,\delta\,\mathrm{Ent}_\nu(f_2)$.

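We apply this lemma with $\nu = \mu$ and

$$f_1 = \frac{g_s^{(\ell)} - 1}{\|g_s^{(\ell)}\|_\infty}, \qquad f_2 = \frac{f}{\mu(f)},$$

to deduce $\mathrm{Cov}(g_s^{(\ell)},f)^2 \le c'\,\delta\,\|g_s^{(\ell)}\|_\infty^2\,\mu(f)\,\mathrm{Ent}(f)$. (Indeed, $\nu(f_1 f_2) = \mathrm{Cov}(g_s^{(\ell)},f)/(\|g_s^{(\ell)}\|_\infty\,\mu(f))$ and
$\mathrm{Ent}_\nu(f_2) = \mathrm{Ent}(f)/\mu(f)$.) Noting also that $\|g_s^{(\ell)}\|_\infty \le \|g_s\|_\infty \le 1/p_{\min}$,
where $p_{\min}$ was defined just before Theorem 3.4, this establishes (19) with $c = c'/p_{\min}^2$ and thus
completes the proof of the theorem. □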



5.2 Proof of Theorem 5.1



In light of Theorem 5.3, to prove Theorem 5.1 it is sufficient to verify the strong concentration
property (17) of the functions $g_s^{(\ell)}$ with $\delta = (\gamma\alpha)^{\ell/5}$.

In order to do this we appeal to a strong concentration of the Hamming distance under the
coupling $\nu$ of $\mu^+$ and $\mu^-$ as defined in the proof of Claim 4.4. Recall the notation used in that
claim, and notice that the Hamming distance is dominated by the size of the population in the $\ell$th
generation of a specific branching process. The following tail bound can be obtained using standard
techniques from the analysis of branching processes, and we defer the proof to the end of this
section.



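Lemma 5.5 Let $\alpha = \max\{\kappa b, 1\}$. Then for every $C \ge 0$,

$$\Pr_\nu\big[\,|\sigma - \sigma'|_{r,\ell} > C\alpha^\ell\,\big] \;\le\; \exp\Big(\frac{1}{\ell+1}\Big(1 - \frac{C}{2e}\Big)\Big).$$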



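Corollary 5.6 For every $C \ge 0$ and $s \in \{+,-\}$,

$$\Pr_\nu\big[\,|g_s^{(\ell)}(\sigma) - g_s^{(\ell)}(\sigma')| > C(\gamma\alpha)^\ell\,\big] \;\le\; \exp\Big(\frac{1}{\ell+1}\Big(1 - \frac{p_{\min}\,C}{2e}\Big)\Big).$$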
Proof: It is enough to show that

$$|g_s^{(\ell)}(\sigma) - g_s^{(\ell)}(\sigma')| \;\le\; \frac{\gamma^\ell\,|\sigma-\sigma'|_{r,\ell}}{p_{\min}} \tag{20}$$

since we can then apply Lemma 5.5 with $C$ replaced by $p_{\min}C$. On the other hand, (20) follows
from part (ii) of Claim 4.4 once we recall that $g_s^{(\ell)}(\sigma) = \mu^{\sigma}_{B_{r,\ell}}(g_s)$ and that $g_s$ depends only on the
spin at the root, implying that
$|g_s^{(\ell)}(\sigma) - g_s^{(\ell)}(\sigma')| \le \|\mu^{\sigma}_{B_{r,\ell}} - \mu^{\sigma'}_{B_{r,\ell}}\|_r \cdot \|g_s\|_\infty \le \gamma^\ell\,|\sigma-\sigma'|_{r,\ell}/p_{\min}$.
□

Before we go on with the proof of Theorem 5.1, let us compare the way we used the constants $\kappa$
and $\gamma$ in the proof of Corollary 5.6 to the way we used them in the proof of Theorem 4.3. In both
cases we used $\kappa$ and $\gamma$ to get bounds for coupling "down" and "up" the tree respectively. Specifically,
we used $\kappa$ to deduce that the Hamming distance between the coupled configurations at the $\ell$th level
is about $(\kappa b)^\ell$, and we then used $\gamma$ to bound the effect of each discrepancy at the $\ell$th level on the
spin at the root (or equivalently, on $g_s^{(\ell)}$) by roughly $\gamma^\ell$. While in Theorem 4.3 it was enough that
the average Hamming distance when coupling down the tree was bounded by $(\kappa b)^\ell$, here we need
that this distance is not much larger than $(\kappa b)^\ell$ with high probability.

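We now return to the proof of Theorem 5.1. W.l.o.g. we may assume that $\gamma\alpha < 1$ since EM$(\ell, 1)$
always holds, and also that $\gamma\alpha > 0$ since if $\gamma = 0$ then EM$(\ell, 0)$ holds because then the spin at the
root $r$ is independent of the rest of the configuration. Let $a = (\gamma\alpha)^{-1} > 1$. Recall that we wish to
establish (17) with $\delta = a^{-\ell/5}$ for all large enough $\ell$. We will show only that

$$\mu\big[g_s^{(\ell)} - 1 > \delta\big] \le \tfrac{1}{2}\,e^{-2/\delta} \tag{21}$$

since the same bound on the negative tail can be achieved by an analogous argument.
We start by applying Corollary 5.6 with $C = a^{\ell/4}$, so that $C(\gamma\alpha)^\ell = a^{-3\ell/4}$, to get that, for every $\varepsilon \ge 0$,

$$\mu^s\big[g_s^{(\ell)} - 1 > \varepsilon\big] \le \mu\big[g_s^{(\ell)} - 1 > \varepsilon - a^{-3\ell/4}\big] + \Delta, \tag{22}$$

where $\Delta = \exp\big(\frac{1}{\ell+1}\big(1 - \frac{p_{\min}\,a^{\ell/4}}{2e}\big)\big)$ and we have used the fact that $\mu$ is a convex combination of $\mu^+$ and $\mu^-$.
Next, we notice that by definition of $g_s^{(\ell)}$,

$$\mu^s\big[g_s^{(\ell)} - 1 > \varepsilon\big] \ge (1+\varepsilon)\,\mu\big[g_s^{(\ell)} - 1 > \varepsilon\big]. \tag{23}$$

Combining (22) and (23) we get that, for every $\varepsilon \ge 0$,

$$\mu\big[g_s^{(\ell)} - 1 > \varepsilon\big] \le \frac{1}{1+\varepsilon}\Big(\mu\big[g_s^{(\ell)} - 1 > \varepsilon - a^{-3\ell/4}\big] + \Delta\Big). \tag{24}$$

This immediately yields that, for every non-negative integer $k$ and $\varepsilon > 0$,

$$\mu\big[g_s^{(\ell)} - 1 > \varepsilon + k\,a^{-3\ell/4}\big] \le (1+\varepsilon)^{-(k+1)} + \frac{\Delta}{\varepsilon}, \tag{25}$$

where we applied (24) $k+1$ times, each time increasing $\varepsilon$ by $a^{-3\ell/4}$, and bounded the resulting geometric sum $\Delta\sum_{i\ge 1}(1+\varepsilon)^{-i}$ by $\Delta/\varepsilon$.

Inequality (21) then follows (assuming $\ell$ is large enough) by applying (25) with $\varepsilon = a^{-\ell/4}$ and
$k = \lceil a^{\ell/2} \rceil$: the threshold $\varepsilon + k\,a^{-3\ell/4}$ is then about $2a^{-\ell/4} \le a^{-\ell/5} = \delta$, while both terms on the right-hand side of (25) decay like $\exp(-\Omega(a^{\ell/4}/\ell))$, which is eventually smaller than $\tfrac14 e^{-2a^{\ell/5}}$. This concludes the proof of Theorem 5.1. □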
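The phrase "assuming $\ell$ is large enough" in the last step can be made concrete numerically. The following is a minimal sketch, not part of the original argument, that evaluates the right-hand side of (25) in log-space for the parameter choice $\varepsilon = a^{-\ell/4}$, $k = \lceil a^{\ell/2}\rceil$, step $a^{-3\ell/4}$, and compares it with the target $\tfrac12 e^{-2/\delta}$, $\delta = a^{-\ell/5}$. The function names and the sample values of `a` and `p_min` are illustrative assumptions.

```python
import math

def log_rhs_of_25(a, p_min, ell):
    """Return (threshold, log of the right-hand side of (25)).

    Assumed parameter choice: eps = a^(-ell/4), step = a^(-3*ell/4),
    k = ceil(a^(ell/2)), Delta = exp((1 - p_min*a^(ell/4)/(2e)) / (ell+1)).
    """
    eps = a ** (-ell / 4)
    step = a ** (-3 * ell / 4)
    k = math.ceil(a ** (ell / 2))
    log_first = -(k + 1) * math.log1p(eps)            # log (1+eps)^-(k+1)
    log_delta = (1 - p_min * a ** (ell / 4) / (2 * math.e)) / (ell + 1)
    log_second = log_delta - math.log(eps)            # log (Delta/eps)
    hi, lo = max(log_first, log_second), min(log_first, log_second)
    log_rhs = hi + math.log1p(math.exp(lo - hi))      # log-sum-exp of two terms
    return eps + k * step, log_rhs

def satisfies_21(a, p_min, ell):
    """True if (25) with the above parameters already implies (21)."""
    delta = a ** (-ell / 5)
    threshold, log_rhs = log_rhs_of_25(a, p_min, ell)
    log_target = -math.log(2) - 2 / delta             # log (1/2 * e^{-2/delta})
    return threshold <= delta and log_rhs <= log_target

if __name__ == "__main__":
    for ell in (100, 200, 300, 400):
        print(ell, satisfies_21(2.0, 0.1, ell))
```

With the sample values above the check fails for moderate depths and succeeds only once $a^{\ell/20}$ dominates $(\ell+1)/p_{\min}$, which illustrates why the statement is asymptotic in $\ell$.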

Finally, we supply the missing proof of Lemma 5.5.

Proof of Lemma 5.5: First notice that, by an exponential Markov inequality, it is enough to
show that $\mathrm{E}_\nu\big[e^{t|\sigma-\sigma'|_{r,\ell}}\big] \le e^{2et\alpha^\ell}$ for all $t \le (2e(\ell+1)\alpha^\ell)^{-1} \le 1$. We thus fix $t$ as above and
let $D_{x,i} = \mathrm{E}_\nu\big[e^{t|\sigma-\sigma'|_{x,i}}\big]$, where $\nu$ is the coupling of $\mu^+$ and $\mu^-$. Note that $D_{x,i}$ can be calculated
recursively as follows. The main observation is that, given a disagreement at $x$, the random variable
$|\sigma-\sigma'|_{x,i}$ is the sum of the $b$ independent random variables $|\sigma-\sigma'|_{z,i-1}$ where $z$ ranges over the
children of $x$. In turn, the random variable $e^{t|\sigma-\sigma'|_{z,i-1}}$ takes the value $D_{z,i-1}$ with probability at
most $\kappa$ (the probability of a disagreement at $z$ given a disagreement at $x$) and the value 1 with
the remaining probability (since $|\sigma-\sigma'|_{z,i-1} = 0$ if there is no disagreement at $z$). Thus, if we let
$\delta_i = \max_x D_{x,i} - 1$, then

$$\delta_{i+1} \le [1 + \kappa\delta_i]^b - 1 \le e^{\kappa b\,\delta_i} - 1 \le e^{\alpha\delta_i} - 1.$$

We wish to show that, for $t$ in the above range, $\delta_\ell \le 2et\alpha^\ell$, which implies
$\mathrm{E}_\nu\big[e^{t|\sigma-\sigma'|_{r,\ell}}\big] \le \delta_\ell + 1 \le e^{2et\alpha^\ell}$, as required. In fact,
we show by induction that $\delta_i \le 2t\,e^{i/(\ell+1)}\,\alpha^i$ for every $0 \le i \le \ell$. For the base case $i = 0$, notice that
$|\sigma-\sigma'|_{x,0} = 1$ when starting from a fixed disagreement at $x$, so $\delta_0 = e^t - 1 \le 2t$ for $t$ in the given
range. For $i + 1 > 0$, we use the fact that $\delta_{i+1} \le e^{\alpha\delta_i} - 1 \le e^{1/(\ell+1)}\cdot\alpha\delta_i$, since by the induction
hypothesis $\delta_i \le \frac{1}{\alpha(\ell+1)}$ for all $0 \le i \le \ell - 1$ and $t$ in the given range. □
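As a sanity check on the induction in the proof of Lemma 5.5, the worst-case recursion $\delta_{i+1} = e^{\alpha\delta_i} - 1$ with $\delta_0 = e^t - 1$ can simply be iterated numerically. The sketch below is an illustration under stated assumptions (the helper names are hypothetical, and the intermediate bound is taken in the form $\delta_i \le 2t\,e^{i/(\ell+1)}\alpha^i$); it uses the largest admissible $t = (2e(\ell+1)\alpha^\ell)^{-1}$.

```python
import math

def delta_sequence(t, alpha, ell):
    """Iterate the worst-case recursion delta_{i+1} = exp(alpha*delta_i) - 1."""
    deltas = [math.expm1(t)]                      # delta_0 = e^t - 1
    for _ in range(ell):
        deltas.append(math.expm1(alpha * deltas[-1]))
    return deltas

def check_induction(alpha, ell):
    """Check delta_i <= 2t e^{i/(ell+1)} alpha^i and delta_ell <= 2 e t alpha^ell."""
    t = 1.0 / (2 * math.e * (ell + 1) * alpha ** ell)   # largest t in the allowed range
    deltas = delta_sequence(t, alpha, ell)
    slack = 1 + 1e-12                                   # guard against rounding only
    stepwise = all(
        d <= 2 * t * math.exp(i / (ell + 1)) * alpha ** i * slack
        for i, d in enumerate(deltas)
    )
    final = deltas[-1] <= 2 * math.e * t * alpha ** ell * slack
    return stepwise and final

if __name__ == "__main__":
    for alpha, ell in [(1.0, 10), (1.5, 20), (3.0, 15)]:
        print(alpha, ell, check_induction(alpha, ell))
```

Taking equality in the recursion is the extremal case, so a pass here is consistent with (but of course does not replace) the inductive argument.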



5.3 A crude bound on log-Sobolev via the spectral gap 

In this section we state and prove a general bound on $c_{\mathrm{sob}}$ using a bound on $c_{\mathrm{gap}}$. Although we do
not require this bound for the results in this paper, we believe that it may find applications in the
future. We state the bound for the Ising model, but it can be easily verified that it generalizes to
any nearest-neighbor spin system on a tree.

Theorem 5.7 For the Ising model on the $b$-ary tree, $c_{\mathrm{sob}}(\mu) = c_{\mathrm{gap}}(\mu) \times \Omega(1/\log n)$. In particular, if
$c_{\mathrm{gap}}(\mu) = \Omega(1)$ then $c_{\mathrm{sob}}(\mu) = \Omega(1/\log n)$.

It is useful to compare this bound with the well-known bound $c_{\mathrm{sob}}(\mu) = c_{\mathrm{gap}}(\mu) \times \Omega(1/n)$ (see,
e.g., [36]), which though much weaker is also more general (for example, it applies to spin systems
on any graph).

Theorem 5.7 is a consequence of the following lemma.

Lemma 5.8 For any $\beta$ and $h$, there exists a constant $c = c(b, \beta, h)$ such that, for any $x \in T$ and all $\eta$,




(26) 



This lemma immediately implies Theorem 5.7 once we notice that $c_{\mathrm{gap}}(\mu^{\eta}_{T_x}) \ge c' \cdot c_{\mathrm{gap}}(\mu)$ for a
constant $c' = c'(b, \beta, h)$ and every $x \in T$ and $\eta \in \Omega$, as can easily be checked.

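Proof of Lemma 5.8: For simplicity and w.l.o.g. we will prove the recursive inequality (26) only
for $T_x = T$ (the whole tree), with root $r$. Let $f$ be a non-negative function. We then write (using
the entropy analog of the decomposition of variance)

$$\mathrm{Ent}(f) = \mu[\mathrm{Ent}_{\tilde T}(f)] + \mathrm{Ent}[\mu_{\tilde T}(f)]. \tag{27}$$

Using the definition of $c_{\mathrm{sob}}$ we have

$$\begin{aligned}
\mu[\mathrm{Ent}_{\tilde T}(f)] &\le \max_{s}\big\{c_{\mathrm{sob}}(\mu^{s}_{\tilde T})^{-1}\big\}\,\sum_{x\in\tilde T}\mu\big[\mathrm{Var}_x(\sqrt{f})\big] \\
&\le \max_{s}\big\{c_{\mathrm{sob}}(\mu^{s}_{\tilde T})^{-1}\big\}\,\mathcal{D}(\sqrt{f}).
\end{aligned} \tag{28}$$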
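The second term on the r.h.s. of (27), being the entropy of a Bernoulli random variable, is bounded
above by

$$\begin{aligned}
\mathrm{Ent}[\mu_{\tilde T}(f)] &\le a\,\mathrm{Var}\Big(\sqrt{\mu_{\tilde T}(f)}\Big) & (29)\\
&\le a\,c_{\mathrm{gap}}(\mu)^{-1}\,\mathcal{D}(\sqrt{f}), & (30)
\end{aligned}$$

where $a = a(p)$ is a constant that depends on $p = \mu(\sigma_r = +)$; specifically $a(p) = \frac{\log[(1-p)/p]}{1-2p}$ for
$p \ne 1/2$, and $a(1/2) = 2$, the limiting value of this expression as $p \to 1/2$ (see [36]).

Putting together (28) and (30), the expression in (27) is bounded above by

$$\Big(\max_{s}\big\{c_{\mathrm{sob}}(\mu^{s}_{\tilde T})^{-1}\big\} + a\,c_{\mathrm{gap}}(\mu)^{-1}\Big)\,\mathcal{D}(\sqrt{f}),$$

so that from the definition of $c_{\mathrm{sob}}$ we have

$$c_{\mathrm{sob}}(\mu)^{-1} \le \max_{s}\big\{c_{\mathrm{sob}}(\mu^{s}_{\tilde T})^{-1}\big\} + a\,c_{\mathrm{gap}}(\mu)^{-1}. \qquad \Box$$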
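The two-point inequality used in (29), $\mathrm{Ent}(F) \le a(p)\,\mathrm{Var}(\sqrt{F})$ for a non-negative function $F$ of a Bernoulli($p$) variable, can be probed numerically. The sketch below assumes the constant $a(p) = \log[(1-p)/p]/(1-2p)$ with $a(1/2) = 2$ (the reciprocal of the classical two-point log-Sobolev constant) and scans a grid of functions; both quantities are 1-homogeneous in $F$, so it is enough to take $(\sqrt{F(+)}, \sqrt{F(-)}) = (\cos\theta, \sin\theta)$. The helper names are illustrative.

```python
import math

def a_const(p):
    """Assumed two-point constant: log((1-p)/p)/(1-2p), with limit 2 at p = 1/2."""
    if abs(p - 0.5) < 1e-12:
        return 2.0
    return math.log((1 - p) / p) / (1 - 2 * p)

def xlogx(u):
    return u * math.log(u) if u > 0 else 0.0

def ent_and_var_sqrt(p, x, y):
    """Entropy of F and variance of sqrt(F), where F = x w.p. p and y w.p. 1-p."""
    mean = p * x + (1 - p) * y
    ent = p * xlogx(x) + (1 - p) * xlogx(y) - xlogx(mean)
    var = p * (1 - p) * (math.sqrt(x) - math.sqrt(y)) ** 2
    return ent, var

def worst_ratio(p, grid=400):
    """Largest observed Ent(F)/Var(sqrt F) over the grid of two-point functions."""
    best = 0.0
    for i in range(grid + 1):
        theta = (math.pi / 2) * i / grid
        ent, var = ent_and_var_sqrt(p, math.cos(theta) ** 2, math.sin(theta) ** 2)
        if var > 1e-12:                      # skip (numerically) constant functions
            best = max(best, ent / var)
    return best

if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.8):
        print(p, worst_ratio(p), a_const(p))
```

For $p = 1/2$ the observed ratio approaches 2 as $F$ approaches a constant, so no constant smaller than 2 could work in (29) at that value of $p$.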



6 Extensions to other models 

As we have already indicated, our techniques extend beyond the Ising model to general nearest-neighbor
interaction models on trees, including those with hard constraints. In this final section
we mention some of these extensions. For a fuller treatment of this material, the reader is referred
to the companion paper [31].

A (nearest-neighbor) spin system on a finite graph $G = (V, E)$ is specified by a finite set $S$ of spin
values, a symmetric pair potential $U : S \times S \to \mathbb{R} \cup \{\infty\}$, and a singleton potential $W : S \to \mathbb{R}$. A
configuration $\sigma \in S^V$ of the system assigns to each vertex (site) $v \in V$ a spin value $\sigma_v \in S$. The
Gibbs distribution is given by

$$\mu(\sigma) \propto \exp\Big[-\Big(\sum_{xy\in E} U(\sigma_x,\sigma_y) + \sum_{x\in V} W(\sigma_x)\Big)\Big].$$

Thus the Ising model corresponds to the case $S = \{\pm 1\}$, and $U(s_1,s_2) = -\beta s_1 s_2$, $W(s) = -\beta h s$,
where $\beta$ is the inverse temperature and $h$ is the external field. Note that setting $U(s_1,s_2) = \infty$
corresponds to a hard constraint, i.e., spin values $s_1, s_2$ are forbidden to be adjacent. We denote
by $\Omega$ the set of all valid spin configurations, i.e., those for which $\mu(\sigma) > 0$.
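To make the definition concrete, here is a minimal brute-force sketch that enumerates $S^V$ for a small graph and normalizes the weights $\exp[-(\sum U + \sum W)]$. It is only an illustration (exponential in $|V|$); the function names are hypothetical, and the hard-core potentials used in the example assume $U(1,1) = \infty$, all other pair potentials equal to 0, and $W(1) = -\ln\lambda$, in line with the parametrization $\lambda = \exp(-L)$ of Section 6.1.

```python
import itertools
import math

def gibbs_distribution(vertices, edges, spins, U, W):
    """Brute-force Gibbs distribution; returns {configuration tuple: probability}.

    Configurations of infinite energy (violated hard constraints) get weight 0
    and are omitted, so the keys are exactly the valid configurations.
    """
    weights = {}
    for config in itertools.product(spins, repeat=len(vertices)):
        sigma = dict(zip(vertices, config))
        energy = sum(U(sigma[x], sigma[y]) for x, y in edges)
        energy += sum(W(sigma[x]) for x in vertices)
        weight = math.exp(-energy)            # exp(-inf) == 0.0
        if weight > 0.0:
            weights[config] = weight
    z = sum(weights.values())                 # partition function
    return {config: w / z for config, w in weights.items()}

def hard_core_potentials(activity):
    """Assumed hard-core potentials: U(1,1) = inf, W(1) = -ln(activity)."""
    U = lambda s1, s2: math.inf if (s1 == 1 and s2 == 1) else 0.0
    W = lambda s: -math.log(activity) if s == 1 else 0.0
    return U, W

if __name__ == "__main__":
    U, W = hard_core_potentials(2.0)
    mu = gibbs_distribution([0, 1, 2], [(0, 1), (1, 2)], [0, 1], U, W)
    for config, prob in sorted(mu.items()):
        print(config, round(prob, 4))
```

On the three-vertex path with activity 2 the valid configurations are the five independent sets, with weights 1, 2, 2, 2 and 4.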

As for the Ising model, we allow boundary conditions which fix the spin values of certain sites.
We carry over our notation from the Ising model: thus, e.g., $\mu^{\tau}_A$ denotes the Gibbs distribution on a
subset $A \subseteq V$ with boundary condition $\tau$ on $\partial A$.

The (heat-bath) Glauber dynamics extends in the obvious way to general spin systems. We first
note that, as the reader may easily check, neither the spatial mixing conditions in Section 3 nor
their proofs made any reference to the details of the Ising model. All of this material therefore
carries over without modification to general spin systems on trees.

Theorem 6.1 The statements of Theorems 3.2 and 3.4 hold for general nearest-neighbor spin systems
on trees.

Likewise, the machinery developed in Sections 4 and 5 for verifying the conditions VM and
EM also extends to general models, though the details of the calculations are model-specific. In
particular, Theorems 4.3 and 5.1 relating VM and EM to the coupling quantities $\kappa$ and $\gamma$ of Definition 4.2
still hold (with very minor modifications). Thus all we need to do is to carry out the
detailed calculations of $\kappa$ and $\gamma$ for the model under consideration. We now state without proof
the results of these calculations for several models of interest. For the proofs, together with further
discussion and extensions, the reader is referred to the companion paper [31].
6.1 The hard-core model (independent sets)

In this model $S = \{0, 1\}$, and we refer to a site as occupied if it has spin value 1, and unoccupied
otherwise. The potentials are

$$U(1,1) = \infty; \quad U(1,0) = U(0,0) = 0; \quad W(1) = L; \quad W(0) = 0,$$

where $L \in \mathbb{R}$. The hard constraint here means that no two adjacent sites may be occupied, so $\Omega$
can be identified with the set of all independent sets in $G$. Also, the aggregated potential of a valid
configuration is proportional to the number of occupied sites. Hence the Gibbs distribution takes
the simple form

$$\mu(\sigma) \propto \lambda^{N(\sigma)},$$

where $N(\sigma)$ is the number of occupied sites and the parameter $\lambda = \exp(-L) > 0$, which controls
the density of occupation, is referred to as the "activity."

The hard-core model on a $b$-ary tree undergoes a phase transition at a critical activity $\lambda = \lambda_0 = \frac{b^b}{(b-1)^{b+1}}$
(see, e.g., [39, 23]). For $\lambda \le \lambda_0$ there is a unique Gibbs measure regardless of the boundary
condition on the leaves, while for $\lambda > \lambda_0$ there are (at least) two distinct phases, corresponding to
the "odd" and "even" boundary conditions respectively. The even boundary condition is obtained
by making the leaves of the tree all occupied if the depth is even, and all unoccupied otherwise.
The odd boundary condition is the complement of this. (These boundary conditions are derived
from the two maximum-density configurations on the infinite tree in which alternate levels,
either odd or even, are completely occupied.) For $\lambda > \lambda_0$, the probability of occupation of the
root in the infinite-volume Gibbs measure differs for odd and even boundary conditions. Relatively
little is known about the Glauber dynamics for the hard-core model on trees, beyond the general
result of Luby and Vigoda [27, 43] which ensures a mixing time of $O(\log n)$ (after translation to
our continuous time setting) when $\lambda < \frac{2}{b-1}$. This result actually holds for any graph $G$ of maximum
degree $b + 1$.

Our results for the Glauber dynamics in the hard-core model mirror those given earlier for the
Ising model. First, for sufficiently small activity $\lambda$ we show that both $c_{\mathrm{gap}}$ and $c_{\mathrm{sob}}$ are uniformly
bounded away from zero for arbitrary boundary conditions. Second, for even (or, symmetrically,
odd) boundary conditions, we get the same result for all activities $\lambda$.

Theorem 6.2 For the hard-core model on the $n$-vertex $b$-ary tree with boundary condition $\tau$, $c_{\mathrm{gap}}(\mu)$
and $c_{\mathrm{sob}}(\mu)$ are $\Omega(1)$ in both of the following situations:

(i) $\tau$ is arbitrary, and $\lambda \le \max\big\{\frac{1}{\sqrt{b}-1},\, \lambda_0\big\}$;

(ii) $\tau$ is even (or odd), and $\lambda > 0$ is arbitrary.

Part (ii) of this theorem is analogous to our earlier result for the Ising model with (+)-boundary
and zero external field at all temperatures. This is in line with the intuition that the even boundary
eliminates the only bottleneck in the dynamics. Part (i) identifies a region in which the mixing time
is insensitive to the boundary condition. We would expect this to hold throughout the low-activity
region $\lambda \le \lambda_0$, and indeed, by analogy with the Ising model, also in some intermediate region
beyond this. Our bound in part (i) confirms this behavior: note that the quantity $\frac{1}{\sqrt{b}-1}$ exceeds $\lambda_0$
for all $b \ge 5$, and indeed for large $b$ it grows as $\frac{1}{\sqrt{b}}$ compared to the $\frac{1}{b}$ growth of $\lambda_0$. Thus for $b \ge 5$
we establish rapid mixing in a region above the critical value $\lambda_0$. To the best of our knowledge this
is the first such result. (Note that the result of [27, 43] mentioned earlier establishes rapid mixing
for $\lambda < \frac{2}{b-1}$, which is less than $\lambda_0$ for all $b$ and so does not even cover the whole uniqueness region.)
We should also mention that our coupling analysis of $c_{\mathrm{gap}}$ in this region has consequences for the
infinite volume Gibbs measure itself, implying that when $\lambda \le \frac{1}{\sqrt{b}-1}$ any $\mu = \lim_{T\to\infty}\mu^{\tau}_{T}$ that is the
limit of finite Gibbs distributions for some boundary configuration $\tau$ is extremal, again a new result.
We elaborate on these points in the companion paper [31].
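The comparison between the three activity thresholds discussed above is easy to tabulate. The following sketch assumes the standard closed form $\lambda_0 = b^b/(b-1)^{b+1}$ for the critical activity on the $b$-ary tree, the coupling threshold $1/(\sqrt{b}-1)$ from part (i), and the earlier general-graph threshold $2/(b-1)$; the function names are illustrative.

```python
import math

def critical_activity(b):
    """Assumed critical activity lambda_0 = b^b / (b-1)^(b+1) on the b-ary tree."""
    return b ** b / (b - 1) ** (b + 1)

def coupling_threshold(b):
    """Threshold 1/(sqrt(b) - 1) appearing in part (i) of Theorem 6.2."""
    return 1.0 / (math.sqrt(b) - 1.0)

def general_graph_threshold(b):
    """Earlier threshold 2/(b-1) for graphs of maximum degree b+1."""
    return 2.0 / (b - 1)

if __name__ == "__main__":
    print(" b   lambda_0   1/(sqrt(b)-1)   2/(b-1)")
    for b in range(2, 11):
        print(f"{b:2d}   {critical_activity(b):8.4f}   {coupling_threshold(b):13.4f}"
              f"   {general_graph_threshold(b):7.4f}")
```

For $b = 4$ the coupling threshold equals 1 while $\lambda_0 = 256/243$, so the crossover indeed happens between $b = 4$ and $b = 5$.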
6.2 The antiferromagnetic Potts model (colorings)

In this model $S = \{1, 2, \ldots, q\}$, and the potentials are $U(s_1,s_2) = \beta\,\delta_{s_1,s_2}$, $W(s) = 0$. This is the
analog of the Ising model except that the interactions are antiferromagnetic, i.e., neighbors with
unequal spins are favored. The most interesting case of this model is when $\beta = \infty$ (i.e., zero
temperature), which introduces hard constraints. Thus if we think of the $q$ spin values as colors,
$\Omega$ is the set of proper colorings of $G$, i.e., assignments of colors to vertices so that no two adjacent
vertices receive the same color. The Gibbs distribution is uniform over proper colorings. In this
model it is $q$ that provides the parameterization. For background on the model, see [8].

For colorings on the $b$-ary tree it is well known that, when $q \le b + 1$, there are multiple Gibbs
measures; this follows immediately from the existence of "frozen configurations," i.e., colorings in
which the color of every internal vertex is forced by the colors of the leaves (see, e.g., [8]). Recently
Jonasson [21] proved that, as soon as $q \ge b + 2$, the Gibbs measure is unique. Moreover, it is known
that there is again an "intermediate" region that includes the value $q = b + 1$, in which the Gibbs
measure, while not unique, is insensitive to "typical" boundary conditions (chosen from the free
measure); see [8].

The sharpest result known for the Glauber dynamics on colorings is due to Vigoda [42], who
shows that for arbitrary boundary conditions the mixing time is $O(\log n)$ provided $q > \frac{11}{6}(b + 1)$.
Actually this result holds for any $n$-vertex graph $G$ of maximum degree $b + 1$.** Our techniques
extend this rapid mixing result all the way down to the critical value $q \ge b + 2$ for which uniqueness
holds, with arbitrary boundary conditions. Again, our result is a consequence of the fact that the
associated log-Sobolev constant is bounded below by a constant independent of $n$:

Theorem 6.3 For the colorings model on the $n$-vertex $b$-ary tree with $q \ge b+2$ and arbitrary boundary
conditions, both $c_{\mathrm{gap}}(\mu)$ and $c_{\mathrm{sob}}(\mu)$ are $\Omega(1)$.



6.3 The ferromagnetic Potts model 

Here we have $S = \{1, 2, \ldots, q\}$ and potentials $U(s_1,s_2) = -\beta\,\delta_{s_1,s_2}$, $W(s) = 0$. This is a straightforward
generalization of the (ferromagnetic) Ising model studied earlier in the paper, in which the
spin at each site can take one of $q$ possible values, and the aggregated potential of any configuration
depends on the number of adjacent pairs of equal spins. There are no hard constraints.

Qualitatively the behavior of this model is similar to that of the Ising model, though less is
known in precise quantitative terms. Again there is a phase transition at a critical $\beta = \beta_0$, which
depends on $b$ and $q$, so that for $\beta > \beta_0$ (and indeed for $\beta \ge \beta_0$ when $q > 2$) there are multiple
phases. This value $\beta_0$ does not in general have a closed form, but it is known [16] that $\beta_0 <
\frac{1}{2}\ln\big(\frac{b+q-1}{b-1}\big)$ for all $q > 2$. (For $q = 2$, this value is exactly $\beta_0$ for the Ising model as quoted earlier.)

Using our techniques, we are able to prove the following: 

Theorem 6.4 For the Potts model on the $n$-vertex $b$-ary tree, $c_{\mathrm{gap}}(\mu)$ and $c_{\mathrm{sob}}(\mu)$ are $\Omega(1)$ in all of the
following situations:

(i) the boundary condition is arbitrary and $\beta < \max\big\{\beta_0,\ \frac{1}{2}\ln\big(\frac{\sqrt{b}+1}{\sqrt{b}-1}\big)\big\}$;

(ii) the boundary condition is constant (e.g., all sites on the boundary have spin 1) and $\beta$ is arbitrary;

(iii) the boundary is free (i.e., the boundary spins are unconstrained) and (3 < Pi, where /?i is the 
solution to the equation f^'^ ■ = \. 



**A recent sequence of papers [12, 32, 17] has reduced the required number of colors further for general graphs,
under the assumption that the maximum degree is $\Omega(\log n)$; the current state of the art requires $q \ge (1 + \varepsilon)(b + 1)$ for
arbitrarily small $\varepsilon > 0$ [18]. However, these results do not apply in our setting where the degree $b + 1$ is fixed.
Part (i) of this theorem shows that $c_{\mathrm{gap}}$ and $c_{\mathrm{sob}}$ are $\Omega(1)$ for arbitrary boundaries throughout
the uniqueness region; also, since $\frac{1}{2}\ln\big(\frac{\sqrt{b}+1}{\sqrt{b}-1}\big) \ge \frac{1}{2}\ln\big(\frac{b+q-1}{b-1}\big) > \beta_0$ when $q \le 2(\sqrt{b} + 1)$, this result
extends into the multiple phase region for many combinations of $b$ and $q$. Part (ii) of the theorem
is an analog of our earlier results for the Ising model with (+)-boundaries at all temperatures.
Part (iii) is of interest for two reasons. First, since $\beta_1 > \beta_0$ always, it exhibits a natural boundary
condition under which $c_{\mathrm{gap}}$ and $c_{\mathrm{sob}}$ are $\Omega(1)$ beyond the uniqueness region (but not for arbitrary $\beta$)
for all combinations of $b$ and $q$. Second, because of an intimate connection between the free
boundary case and so-called "reconstruction problems" on trees [33] (in which the edges are noisy
channels and the goal is to reconstruct a value transmitted from the root), we obtain an alternative
proof of the best known value of the noise parameter under which reconstruction is impossible [34].
Indeed, a slight strengthening of part (iii) allows us to marginally improve on this threshold. Again,
we spell out the details in [31].
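The comparison of thresholds in the discussion of part (i) reduces to elementary algebra: since $\frac{\sqrt{b}+1}{\sqrt{b}-1} = \frac{b+2\sqrt{b}+1}{b-1}$, the inequality $\frac12\ln\frac{\sqrt b+1}{\sqrt b-1} \ge \frac12\ln\frac{b+q-1}{b-1}$ holds exactly when $q \le 2(\sqrt{b}+1)$. The sketch below checks this equivalence over a range of integer pairs $(b, q)$; the function names are illustrative, and the condition $q \le 2(\sqrt b + 1)$ is tested in the exact integer form $(q-2)^2 \le 4b$.

```python
import math

def arbitrary_boundary_threshold(b):
    """The quantity (1/2) ln((sqrt(b)+1)/(sqrt(b)-1)) from part (i) of Theorem 6.4."""
    root = math.sqrt(b)
    return 0.5 * math.log((root + 1.0) / (root - 1.0))

def beta0_upper_bound(b, q):
    """The upper bound (1/2) ln((b+q-1)/(b-1)) on the critical value beta_0."""
    return 0.5 * math.log((b + q - 1) / (b - 1))

def extends_beyond_bound(b, q):
    """True if the part (i) threshold is at least the upper bound on beta_0."""
    return arbitrary_boundary_threshold(b) >= beta0_upper_bound(b, q) - 1e-12

def q_condition(b, q):
    """Exact integer form of q <= 2(sqrt(b) + 1), valid for q >= 2."""
    return (q - 2) ** 2 <= 4 * b

if __name__ == "__main__":
    for b in (2, 4, 9, 20):
        qs = [q for q in range(2, 30) if extends_beyond_bound(b, q)]
        print(f"b={b}: covered q in {qs[0]}..{qs[-1]}")
```

For example, with $b = 9$ the comparison holds for all $q$ up to $2(\sqrt 9 + 1) = 8$.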

7 Proofs omitted from the main text 

In this final section, we supply the proofs of some technical lemmas that were omitted from the 
main text. 

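7.1 Proof of Lemma 3.5

The lemma in fact holds in a more general setting, where in place of $T_x$ and $B_{x,\ell}$ we think of two
arbitrary subsets $A, B$ such that $A \cup B = T_x$. Also, in this proof we write $\nu = \mu^{\eta}_{T_x}$ and Var and Ent
for variance and entropy with respect to $\nu$. For part (i) we will show that if for any function $g$ that
does not depend on $B$ we have $\mathrm{Var}[\nu_A(g)] \le \varepsilon \cdot \mathrm{Var}(g)$, then for any function $f$,

$$\mathrm{Var}[\nu_A(f)] \le \frac{2(1-\varepsilon)}{1-2\varepsilon}\cdot\nu[\mathrm{Var}_B(f)] + \frac{2\varepsilon}{1-2\varepsilon}\cdot\nu[\mathrm{Var}_A(f)].$$

Notice that by the convexity of variance we have $\mathrm{Var}(g_1 + g_2) \le 2[\mathrm{Var}(g_1) + \mathrm{Var}(g_2)]$ for any
two functions $g_1, g_2$. We therefore write

$$\begin{aligned}
\mathrm{Var}[\nu_A(f)] &= \mathrm{Var}[\nu_A(f - \nu_B(f)) + \nu_A(\nu_B(f))] \\
&\le 2\,\mathrm{Var}[\nu_A(f - \nu_B(f))] + 2\,\mathrm{Var}[\nu_A(\nu_B(f))] \\
&\le 2\,\mathrm{Var}[f - \nu_B(f)] + 2\varepsilon\,\mathrm{Var}[\nu_B(f)] \\
&= 2\,\nu[\mathrm{Var}_B(f)] + 2\varepsilon\big(\mathrm{Var}[\nu_A(f)] + \nu[\mathrm{Var}_A(f)] - \nu[\mathrm{Var}_B(f)]\big),
\end{aligned}$$

where we used the facts that $\mathrm{Var}[f - \nu_B(f)] = \nu[\mathrm{Var}_B(f)]$ and that $\mathrm{Var}[\nu_A(f)] + \nu[\mathrm{Var}_A(f)] =
\mathrm{Var}[\nu_B(f)] + \nu[\mathrm{Var}_B(f)] = \mathrm{Var}(f)$ by the standard decomposition of variance. Moving the term
$2\varepsilon\,\mathrm{Var}[\nu_A(f)]$ to the left-hand side and dividing by $1 - 2\varepsilon$, we conclude that
$\mathrm{Var}[\nu_A(f)] \le \frac{2(1-\varepsilon)}{1-2\varepsilon}\cdot\nu[\mathrm{Var}_B(f)] + \frac{2\varepsilon}{1-2\varepsilon}\cdot\nu[\mathrm{Var}_A(f)]$, as required.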

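We proceed to part (ii). Here we have to show that if for any non-negative function $g$ that does
not depend on $B$ we have $\mathrm{Ent}[\nu_A(g)] \le \varepsilon \cdot \mathrm{Ent}(g)$, then for any non-negative function $f$,

$$\mathrm{Ent}[\nu_A(f)] \le \frac{1}{1-\varepsilon'}\cdot\nu[\mathrm{Ent}_B(f)] + \frac{\varepsilon'}{1-\varepsilon'}\cdot\nu[\mathrm{Ent}_A(f)], \tag{31}$$

where $\varepsilon' = \sqrt{\varepsilon}/p$ and $p$ stands for the minimum non-zero probability of any configuration in $T_x \setminus A$.
We will in fact show that

$$\mathrm{Ent}(f) \le \frac{1}{1-\varepsilon'}\big(\nu[\mathrm{Ent}_A(f)] + \nu[\mathrm{Ent}_B(f)]\big), \tag{32}$$

which implies (31) since $\mathrm{Ent}[\nu_A(f)] = \mathrm{Ent}(f) - \nu[\mathrm{Ent}_A(f)]$.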
Before we go on with the proof, let us review some properties of entropy. First, by definition,
$\mathrm{Ent}(f) = \nu\big(f\log\frac{f}{\nu(f)}\big)$ and $\nu[\mathrm{Ent}_A(f)] = \nu\big(f\log\frac{f}{\nu_A(f)}\big)$. Also, by the variational characterization of
entropy we have $\nu_A\big(f\log\frac{g}{\nu_A(g)}\big) \le \mathrm{Ent}_A(f)$ for all non-negative functions $f$ and $g$.

We can now proceed with the proof of (32) by writing

$$\begin{aligned}
\mathrm{Ent}(f) &= \nu\Big[f\log\frac{f}{\nu_B(f)}\Big] + \nu\Big[f\log\frac{\nu_B(f)}{\nu_A(\nu_B(f))}\Big] + \nu\Big[f\log\frac{\nu_A(\nu_B(f))}{\nu(f)}\Big] \\
&\le \nu[\mathrm{Ent}_B(f)] + \nu[\mathrm{Ent}_A(f)] + \nu\Big[\nu_A(f)\log\frac{\nu_A(\nu_B(f))}{\nu(f)}\Big].
\end{aligned}$$

Therefore, (32) will follow once we show that $\nu\big[\nu_A(f)\log\frac{\nu_A(\nu_B(f))}{\nu(f)}\big] \le \varepsilon'\,\mathrm{Ent}(f)$. We use the
following claim in order to get this bound.



Claim 7.1 Let $\mu$ be a probability measure over a space $\Omega$ where the probability of any $\sigma \in \Omega$ is either
zero or at least $p$. Then for any two non-negative functions $f$ and $g$ over $\Omega$ we have

$$\mu\Big[f\log\frac{g}{\mu(g)}\Big] \le \frac{1}{p}\sqrt{\frac{\mu(f)}{\mu(g)}\,\mathrm{Ent}(f)\cdot\mathrm{Ent}(g)},$$

where Ent is taken w.r.t. $\mu$.

Assuming Claim 7.1 we conclude that

$$\nu\Big[\nu_A(f)\log\frac{\nu_A(\nu_B(f))}{\nu(f)}\Big] \le \frac{1}{p}\sqrt{\mathrm{Ent}[\nu_A(f)]\cdot\mathrm{Ent}[\nu_A(\nu_B(f))]} \le \frac{1}{p}\sqrt{\varepsilon\cdot\mathrm{Ent}[\nu_A(f)]\cdot\mathrm{Ent}[\nu_B(f)]} \le \frac{\sqrt{\varepsilon}}{p}\,\mathrm{Ent}(f),$$

completing the proof of Lemma 3.5. We note that, since neither $\nu_A(f)$ nor $\nu_A(\nu_B(f))$ depends on $A$,
the effective probability space in the above derivation is the marginal over $T_x \setminus A$, so indeed $p$ can
be taken as the minimum marginal probability of configurations restricted to $T_x \setminus A$.

It remains to prove Claim 7.1. Consider two arbitrary non-negative functions $f$ and $g$. Let $\chi$ be
the indicator function of the event that $g > \mu(g)$. Clearly, $\chi\log\frac{g}{\mu(g)} \ge 0$ while $(1-\chi)\log\frac{g}{\mu(g)} \le 0$.
Also, since $\mu\big[\log\frac{g}{\mu(g)}\big] \le \log\mu\big[\frac{g}{\mu(g)}\big] = 0$, then $\mu\big[(1-\chi)\log\frac{g}{\mu(g)}\big] \le -\mu\big[\chi\log\frac{g}{\mu(g)}\big]$. Letting $f_{\max}$
and $f_{\min}$ be the maximum and minimum values of $f$ respectively over configurations with non-zero
probability, we get:

$$\begin{aligned}
\mu\Big[f\log\frac{g}{\mu(g)}\Big] &\le (f_{\max} - f_{\min})\,\mu\Big[\chi\log\frac{g}{\mu(g)}\Big] \\
&\le \frac{\|f-\mu(f)\|_1}{p}\cdot\mu\Big[\chi\Big(\frac{g}{\mu(g)} - 1\Big)\Big] \\
&= \frac{1}{p}\cdot\|f-\mu(f)\|_1\cdot\frac{\|g-\mu(g)\|_1}{2\,\mu(g)} \\
&\le \frac{1}{p}\sqrt{\frac{\mu(f)}{\mu(g)}\,\mathrm{Ent}(f)\cdot\mathrm{Ent}(g)},
\end{aligned}$$

where we wrote $\|\cdot\|_1$ for the $\ell_1$ norm with respect to $\mu$ and used the fact that $\|f - \mu(f)\|_1^2 \le
2\mu(f)\,\mathrm{Ent}(f)$ for any non-negative function $f$ (see, e.g., [36]). The proof of Claim 7.1 is now
complete. □
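Claim 7.1 is a statement about arbitrary finite probability spaces, so it lends itself to a quick randomized sanity check. The sketch below assumes the claim in the form $\mu[f\log(g/\mu(g))] \le \frac1p\sqrt{\frac{\mu(f)}{\mu(g)}\mathrm{Ent}(f)\,\mathrm{Ent}(g)}$ with $p = \min_\sigma\mu(\sigma)$, draws random measures and strictly positive functions, and reports the largest observed value of (left side minus right side). All names are illustrative, and a passing run is only evidence, not a proof.

```python
import math
import random

def expectation(mu, f):
    return sum(m * v for m, v in zip(mu, f))

def entropy(mu, f):
    """Ent(f) = mu(f log(f / mu(f))) for strictly positive f (clipped at 0 for rounding)."""
    mean = expectation(mu, f)
    return max(0.0, sum(m * v * math.log(v / mean) for m, v in zip(mu, f)))

def claim_sides(mu, f, g):
    """Return (lhs, rhs) of the inequality in Claim 7.1."""
    p = min(mu)
    mean_g = expectation(mu, g)
    lhs = sum(m * fv * math.log(gv / mean_g) for m, fv, gv in zip(mu, f, g))
    rhs = math.sqrt(expectation(mu, f) / mean_g * entropy(mu, f) * entropy(mu, g)) / p
    return lhs, rhs

def random_instance(rng, size):
    raw = [rng.uniform(0.1, 1.0) for _ in range(size)]
    total = sum(raw)
    mu = [r / total for r in raw]
    f = [rng.uniform(0.01, 10.0) for _ in range(size)]
    g = [rng.uniform(0.01, 10.0) for _ in range(size)]
    return mu, f, g

def max_violation(trials=2000, seed=0):
    """Largest value of lhs - rhs over random instances (should not be positive)."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        mu, f, g = random_instance(rng, rng.randint(2, 6))
        lhs, rhs = claim_sides(mu, f, g)
        worst = max(worst, lhs - rhs)
    return worst

if __name__ == "__main__":
    print("largest lhs - rhs observed:", max_violation())
```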

7.2 Proof of reverse direction of Theorem 3.4

In the main text we proved the forward direction of Theorem 3.4. Here we prove the reverse
direction, i.e., that $\min_{x,\eta} c_{\mathrm{sob}}(\mu^{\eta}_{T_x}) = \Omega(1)$ implies EM$(\ell, c\,e^{-\vartheta\ell})$ for all $\ell$, where $c = c(b,\beta,h)$
and $\vartheta = \vartheta(b,\beta,h)$ are constants independent of $\ell$. To do this, we follow the same line of reasoning
as in the proof of Theorem 5.2, namely, we establish the strong concentration property of the
functions $g_s^{(\ell)}$ as in Section 5.1 and then appeal to Theorem 5.3. The proof of concentration is
accomplished via hypercontractivity bounds, assuming the above condition on $c_{\mathrm{sob}}$.

For a function $f$, let $\Lambda_f \subseteq T$ denote the subset of sites on whose spins $f$ depends. We then have:

Lemma 7.2 Let $\nu$ be any Gibbs measure on $T$, $f$ any function, and $B$ any subset that includes all sites
within distance $\ell$ from $\Lambda_f$. Then there exists a constant $\vartheta'$, depending only on the degree $b$, such that

$$\|\nu_B(f) - \nu(f)\|_q \le 3\,e^{-\vartheta' c_{\mathrm{sob}}(\nu)\,\ell}\,|\Lambda_f|\,\|f - \nu(f)\|_\infty,$$

where $q = 1 + e^{c_{\mathrm{sob}}(\nu)\,\vartheta'\ell}$ and norms are taken w.r.t. $\nu$.

We first assume Lemma 7.2 and complete the proof of the reverse direction of Theorem 3.4.

For simplicity, we verify EM only for the case $T_x = T$ (the whole tree), with root $r$. Recall the
functions $g_s^{(\ell)}$ from Section 5.1, the fact that $g_s^{(\ell)} = \mu_{B_{r,\ell}}(g_s)$ by definition, and that $g_s$ depends only
on the spin at $r$. Applying Lemma 7.2 with $\nu = \mu$, $f = g_s$, and $B = B_{r,\ell}$, together with the fact
that $c_{\mathrm{sob}}(\mu) = \Omega(1)$ by hypothesis, we conclude that there exists a constant $\vartheta''$ such that

$$\|g_s^{(\ell)} - 1\|_q \le 3\,e^{-\vartheta''\ell}\,\|g_s - 1\|_\infty \le 3\,e^{-\vartheta''\ell}/p_{\min},$$

where $q = 1 + e^{\vartheta''\ell}$ and norms are taken w.r.t. $\mu$. Therefore, using a Markov inequality, there exist
constants $\ell_0$ and $\vartheta$ such that, for all $\ell \ge \ell_0$,

$$\mu\big[\,|g_s^{(\ell)} - 1| > e^{-\vartheta\ell}\,\big] \le e^{-2e^{\vartheta\ell}}.$$

This establishes the strong concentration property of $g_s^{(\ell)}$ as in (17) with $\delta = e^{-\vartheta\ell}$, from which EM follows by
Theorem 5.3. □

Remark: A similar claim to Lemma 7.2 was proved in [41] in the context of $\mathbb{Z}^d$; we reprove it below for
completeness. The proof, as well as the fact that an $\Omega(1)$ logarithmic Sobolev constant implies EM$(\ell, c\,e^{-\vartheta\ell})$,
applies to general, finite range models on any graph of bounded degree.

Proof of Lemma 7.2: The proof has two main ingredients: the first is a bound on the speed at
which information propagates under the Glauber dynamics, while the second is a standard relationship
between $c_{\mathrm{sob}}$ and hypercontractivity bounds.

Let $P_t = e^{t\mathcal{L}}$ stand for the transition kernel at time $t$ (as discussed earlier in the paper) of the dynamics
under consideration, reversible w.r.t. the Gibbs measure $\nu$, and let $P_t^B$ stand for the transition kernel
of a modified dynamics where the spins of the sites outside the subset $B$ are fixed to their values at
time zero (the sites inside $B$ being updated according to the same rule as in the original dynamics).
It is well known (see, e.g., [41]) that there exists a constant $k_0$ depending only on $b$ (or on the
degree of the graph in the general case) and the maximum flip rate $\max_x \|c_x\|_\infty$ (which is bounded
by 1 in the case of the heat bath dynamics) such that, for any function $f$, any $t$ and any subset $B$
that includes all sites within distance $k_0 t$ of $\Lambda_f$,

$$\|P_t f - P_t^B f\|_\infty \le 2\,e^{-t}\,|\Lambda_f|\,\|f\|_\infty. \tag{33}$$

Equation (33) is a manifestation of the fact that it takes at least $\ell/k_0$ time before the spin at a site
can become sensitive to the configuration at distance $\ell$ from it.

The second ingredient we need is a hypercontractivity bound. From Gross's integration lemma
(see, e.g., [1]), we have $\|P_t f\|_q \le \|f\|_2$ for any mean-zero function $f$, any $t$, and $2 \le q \le 1 + e^{c_{\mathrm{sob}} t}$,
where $c_{\mathrm{sob}} = c_{\mathrm{sob}}(\nu)$. Adding to this the fact that $c_{\mathrm{gap}} \ge c_{\mathrm{sob}}$, we may write

$$\|P_t f\|_q = \|P_{t/2}(P_{t/2} f)\|_q \le \|P_{t/2} f\|_2 \le e^{-c_{\mathrm{gap}} t/2}\,\|f\|_2 \le e^{-c_{\mathrm{sob}} t/2}\,\|f\|_2, \tag{34}$$

where $q = 1 + e^{c_{\mathrm{sob}} t/2}$ and we used the fact that $c_{\mathrm{gap}}$ bounds the rate of decay of the $L^2$ norm.

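We now conclude the proof of Lemma 7.2 as follows. Without loss of generality, consider an
arbitrary function $f$ with $\nu(f) = 0$. Let $\ell$ be arbitrary, and $B$ be a subset that includes all sites
within distance $\ell$ of $\Lambda_f$. Then, for $t = \ell/k_0$ and $q = 1 + e^{c_{\mathrm{sob}} t/2}$, we have

$$\begin{aligned}
\|\nu_B(f)\|_q &= \|\nu_B(P_t^B f)\|_q \\
&\le \|P_t^B f\|_q \\
&\le \|P_t^B f - P_t f\|_q + \|P_t f\|_q \\
&\le 2\,e^{-t}\,|\Lambda_f|\,\|f\|_\infty + e^{-c_{\mathrm{sob}} t/2}\,\|f\|_2 \\
&\le 3\,e^{-\vartheta' c_{\mathrm{sob}}\,\ell}\,|\Lambda_f|\,\|f\|_\infty,
\end{aligned}$$

taking the constant $\vartheta' = 1/(2k_0)$ (and using the fact that $c_{\mathrm{sob}} \le 1$). □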

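7.3 Proof of Lemma 5.4

We split our analysis of $\nu(f_1 f_2)^2$ into three cases:

(a) $\mathrm{Ent}_\nu(f_2) \ge \frac{1}{\delta}$;

(b) $\delta \le \mathrm{Ent}_\nu(f_2) < \frac{1}{\delta}$;

(c) $\mathrm{Ent}_\nu(f_2) < \delta$.

Case (a). We simply bound

$$\nu(f_1 f_2)^2 \le \|f_1\|_\infty^2\,\nu(f_2)^2 \le 1 \le \delta\,\mathrm{Ent}_\nu(f_2).$$

Case (b). We use the entropy inequality (see, e.g., [1]), which states that for any $t > 0$,

$$\nu(f_1 f_2) \le \frac{1}{t}\log\nu\big(e^{t f_1}\big) + \frac{1}{t}\,\mathrm{Ent}_\nu(f_2). \tag{35}$$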



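We choose the free parameter $t$ in (35) equal to $\sqrt{\mathrm{Ent}_\nu(f_2)/\delta}$. Notice that, by construction, $1 \le
t \le \delta^{-1}$. Using the assumption $\nu(|f_1| > \delta) \le e^{-2/\delta}$ together with $\|f_1\|_\infty \le 1$, we get

$$\nu(f_1 f_2)^2 \le \Big[\frac{1}{t}\log\big(e^{t\delta} + e^{t - 2/\delta}\big) + \sqrt{\delta\,\mathrm{Ent}_\nu(f_2)}\Big]^2
\le \Big[c_1\delta + \sqrt{\delta\,\mathrm{Ent}_\nu(f_2)}\Big]^2 \le c_2\,\delta\,\mathrm{Ent}_\nu(f_2)$$

for suitable numerical constants $c_1, c_2$.

Case (c). Again we use the entropy inequality with $t = \sqrt{\mathrm{Ent}_\nu(f_2)/\delta} < 1$, but we now simply bound
the Laplace transform $\nu(e^{t f_1})$ by a Taylor expansion (in $t$) up to second order:

$$\log\nu\big(e^{t f_1}\big) \le \log\Big(1 + \frac{e}{2}\,t^2\,\nu(f_1^2)\Big) \le \frac{e}{2}\,t^2\,\big[\delta^2 + e^{-2/\delta}\big],$$

which by (35) implies

$$\nu(f_1 f_2)^2 \le \Big[\frac{e}{2}\big[\delta^2 + e^{-2/\delta}\big]\sqrt{\mathrm{Ent}_\nu(f_2)/\delta} + \sqrt{\delta\,\mathrm{Ent}_\nu(f_2)}\Big]^2
= \Big[\frac{e}{2}\cdot\frac{\delta^2 + e^{-2/\delta}}{\delta} + 1\Big]^2\,\delta\,\mathrm{Ent}_\nu(f_2) \le c_3\,\delta\,\mathrm{Ent}_\nu(f_2)$$

for another numerical constant $c_3$.

□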
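The entropy inequality (35), which drives cases (b) and (c), is an instance of the Gibbs variational principle and can be sanity-checked on finite probability spaces. The following sketch is an illustration only (all names are hypothetical): it draws a random measure $\nu$, a bounded function $f_1$, a density $f_2$ with $\nu(f_2) = 1$ and a parameter $t > 0$, and evaluates the slack $\frac1t\log\nu(e^{tf_1}) + \frac1t\mathrm{Ent}_\nu(f_2) - \nu(f_1 f_2)$, which should never be negative.

```python
import math
import random

def entropy_inequality_slack(nu, f1, f2, t):
    """Right side minus left side of (35), after normalising f2 to a density."""
    mass = sum(n * v for n, v in zip(nu, f2))
    density = [v / mass for v in f2]                       # now nu(density) = 1
    lhs = sum(n * a * d for n, a, d in zip(nu, f1, density))
    log_laplace = math.log(sum(n * math.exp(t * a) for n, a in zip(nu, f1)))
    ent = sum(n * d * math.log(d) for n, d in zip(nu, density) if d > 0.0)
    return (log_laplace + ent) / t - lhs

def min_slack(trials=2000, seed=0):
    """Smallest slack seen over random instances (should be >= 0 up to rounding)."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        size = rng.randint(2, 6)
        raw = [rng.uniform(0.05, 1.0) for _ in range(size)]
        total = sum(raw)
        nu = [r / total for r in raw]
        f1 = [rng.uniform(-1.0, 1.0) for _ in range(size)]
        f2 = [rng.uniform(0.0, 5.0) for _ in range(size)]
        t = rng.uniform(0.05, 20.0)
        worst = min(worst, entropy_inequality_slack(nu, f1, f2, t))
    return worst

if __name__ == "__main__":
    print("smallest slack observed:", min_slack())
```

The slack vanishes exactly when $f_2$ is proportional to $e^{tf_1}$, which is why optimising over $t$ in the proof is the natural next step.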